Exploring missing heritability in neurodevelopmental disorders: Learning from regulatory elements
In this thesis, I aimed to resolve part of the missing heritability in neurodevelopmental disorders using computational approaches. Alongside the investigation of a novel epilepsy syndrome, and of the regulation of the gene involved, I investigated and prioritized genomic sequences with implications for gene regulation during the developmental stages of the human brain. The goal was to create an atlas of high-confidence non-coding regulatory elements that future studies can assess for genetic variants in genetically unexplained individuals suffering from neurodevelopmental disorders of suspected genetic origin.
Flood dynamics derived from video remote sensing
Flooding is by far the most pervasive natural hazard, with the human impacts of floods expected to worsen in the coming decades due to climate change. Hydraulic models are a key tool for understanding flood dynamics and play a pivotal role in unravelling the processes that occur during a flood event, including inundation flow patterns and velocities. In the realm of river basin dynamics, video remote sensing is emerging as a transformative tool that can offer insights into flow dynamics and thus, together with other remotely sensed data, has the potential to be deployed to estimate discharge. Moreover, the integration of video remote sensing data with hydraulic models offers a pivotal opportunity to enhance the predictive capacity of these models.
Hydraulic models are traditionally built with accurate terrain, flow and bathymetric data and are often calibrated and validated using observed data to obtain meaningful and actionable model predictions. Data for accurately calibrating and validating hydraulic models are not always available, leaving the assessment of the predictive capabilities of some models deployed in flood risk management in question. Recent advances in remote sensing have heralded the availability of vast video datasets of high resolution. The parallel evolution of computing capabilities, coupled with advancements in artificial intelligence, is enabling the processing of data at unprecedented scales and complexities, allowing us to glean meaningful insights into datasets that can be integrated with hydraulic models. The aims of the research presented in this thesis were twofold. The first aim was to evaluate and explore the potential applications of video from air- and space-borne platforms to comprehensively calibrate and validate two-dimensional hydraulic models. The second aim was to estimate river discharge using satellite video combined with high resolution topographic data. In the first of three empirical chapters, non-intrusive image velocimetry techniques were employed to estimate river surface velocities in a rural catchment. For the first time, a 2D hydraulic model was fully calibrated and validated using velocities derived from Unpiloted Aerial Vehicle (UAV) image velocimetry approaches. This highlighted the value of these data in mitigating the limitations associated with traditional data sources used in parameterizing two-dimensional hydraulic models. This finding inspired the subsequent chapter where river surface velocities, derived using Large Scale Particle Image Velocimetry (LSPIV), and flood extents, derived using deep neural network-based segmentation, were extracted from satellite video and used to rigorously assess the skill of a two-dimensional hydraulic model.
Harnessing the ability of deep neural networks to learn complex features and deliver accurate and contextually informed flood segmentation, the potential value of satellite video for validating two-dimensional hydraulic model simulations is demonstrated. In the final empirical chapter, the convergence of satellite video imagery and high-resolution topographical data bridges the gap between visual observations and quantitative measurements by enabling the direct extraction of velocities from video imagery, which is used to estimate river discharge. Overall, this thesis demonstrates the significant potential of emerging video-based remote sensing datasets and offers approaches for integrating these data into hydraulic modelling and discharge estimation practice. The incorporation of LSPIV techniques into flood modelling workflows signifies a methodological progression, especially in areas lacking robust data collection infrastructure. Satellite video remote sensing heralds a major step forward in our ability to observe river dynamics in real time, with potentially significant implications in the domain of flood modelling science.
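The image velocimetry described above rests on a simple idea: track the displacement of water-surface features between consecutive video frames and convert pixel displacement to velocity. A minimal, illustrative sketch of that matching step (a brute-force cross-correlation search, not the thesis implementation; the `max_shift` bound is an assumption):

```python
import numpy as np

def piv_displacement(win_a, win_b, max_shift=8):
    """Estimate the integer pixel displacement of win_b relative to win_a
    by maximising the cross-correlation of mean-centred windows."""
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    best, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Shift win_b back by (dy, dx) and score the overlap with win_a.
            shifted = np.roll(np.roll(b, -dy, axis=0), -dx, axis=1)
            score = float((a * shifted).sum())
            if score > best:
                best, best_shift = score, (dy, dx)
    return best_shift  # surface velocity = displacement * pixel_size / frame_dt
```

Production LSPIV tools use FFT-based correlation over many interrogation windows plus sub-pixel peak fitting; this sketch only shows the core matching principle.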
Explainable Artificial Intelligence Methods in FinTech Applications
The increasing amount of available data and access to high-performance computing allow companies to use complex Machine Learning (ML) models, so-called “black-box” models, in their decision-making processes. These “black-box” models typically show higher predictive accuracy than linear models on complex data sets. However, this improved predictive accuracy comes at the cost of explanatory power. “Opening the black box”, that is, making model predictions explainable, is the focus of the research area of Explainable Artificial Intelligence (XAI). Using black-box models also raises practical and ethical issues, especially in critical industries such as finance. For this reason, the explainability of models is increasingly becoming a focus for regulators. Applying XAI methods to ML models makes their predictions explainable and hence enables the application of ML models in the financial industry. The application of ML models increases predictive accuracy and supports the different stakeholders in the financial industry in their decision-making processes.
This thesis consists of five chapters: a general introduction, a chapter on conclusions and future research, and three separate chapters covering the underlying papers. Chapter 1 proposes an XAI method that can be used in credit risk management, in particular in measuring the risks associated with borrowing through peer-to-peer lending platforms. The model applies correlation networks to Shapley values, so that model predictions are grouped according to the similarity of the underlying explanations. Chapter 2 develops an alternative XAI method based on the Lorenz Zonoid approach. The new method is statistically normalised and can therefore be used as a standard for the application of Artificial Intelligence (AI) in credit risk management. The novel “Shapley-Lorenz” approach can facilitate the validation of model results and supports the decision whether a model is sufficiently explained. In Chapter 3, an XAI method is applied to assess the impact of financial and non-financial factors on a firm’s ex-ante cost of capital, a measure that reflects investors’ perceptions of a firm’s risk appetite. A combination of two explanatory
tools, the Shapley values and the Lorenz model selection approach, enabled the identification of the most important features and a reduction in the number of independent features. This allowed a substantial simplification of the model without a statistically significant decrease in predictive accuracy.
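The Shapley values that Chapters 1 and 3 build on attribute a prediction to individual features by averaging each feature's marginal contribution over all subsets of the remaining features. A self-contained sketch of the exact computation for a toy value function follows; it is illustrative only (exact enumeration is exponential in the number of features, so applied work uses model-specific approximations):

```python
from itertools import combinations
from math import factorial

def shapley_values(n, value):
    """Exact Shapley values for a cooperative game.
    value: callable mapping a frozenset of feature indices to a payoff."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight = |S|! * (n - |S| - 1)! / n!
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[i] += w * (value(s | {i}) - value(s))
    return phi

# For an additive game, the Shapley value recovers each feature's own payoff.
contributions = shapley_values(3, lambda s: sum(j + 1 for j in s))
```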
Online semi-supervised learning in non-stationary environments
Existing Data Stream Mining (DSM) algorithms assume the availability of labelled and
balanced data, immediately or after some delay, to extract worthwhile knowledge from the
continuous and rapid data streams. However, in many real-world applications such as
Robotics, Weather Monitoring, Fraud Detection Systems, Cyber Security, and Computer
Network Traffic Flow, an enormous amount of high-speed data is generated by Internet of
Things sensors and real-time data on the Internet. Manual labelling of these data streams
is not practical due to time consumption and the need for domain expertise. Another
challenge is learning under Non-Stationary Environments (NSEs), which occurs due to
changes in the data distributions in a set of input variables and/or class labels. The problem
of Extreme Verification Latency (EVL) under NSEs is referred to as Initially Labelled Non-Stationary Environment (ILNSE). This is a challenging task because the learning algorithms
have no access to the true class labels directly when the concept evolves. Several approaches
exist that deal with NSE and EVL in isolation. However, few algorithms address both issues
simultaneously. This research directly responds to the ILNSE challenge by proposing two
novel algorithms: the “Predictor for Streaming Data with Scarce Labels” (PSDSL) and the
Heterogeneous Dynamic Weighted Majority (HDWM) classifier. PSDSL is an Online Semi-Supervised Learning (OSSL) method for real-time DSM and is closely related to label
scarcity issues in online machine learning.
The key capabilities of PSDSL include learning from a small amount of labelled data in an
incremental or online manner and being available to predict at any time. To achieve this,
PSDSL utilises both labelled and unlabelled data to train the prediction models, meaning it
continuously learns from incoming data and updates the model as new labelled or
unlabelled data becomes available over time. Furthermore, it can predict under NSE
conditions under the scarcity of class labels. PSDSL is built on top of the HDWM classifier,
which preserves the diversity of the classifiers. PSDSL and HDWM can intelligently switch
and adapt to the conditions. The PSDSL adapts to learning states between self-learning,
micro-clustering and CGC, whichever approach is beneficial, based on the characteristics of
the data stream. HDWM makes use of “seed” learners of different types in an ensemble to
maintain its diversity. An ensemble is simply a combination of predictive models
grouped together to improve on the predictive performance of a single classifier.
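The thesis algorithms themselves are not reproduced here, but the weighted-majority idea they build on can be sketched generically; the `beta` penalty and `theta` pruning threshold below are illustrative defaults, not values from the thesis:

```python
class WeightedMajority:
    """Minimal weighted-majority voting over a pool of expert classifiers."""

    def __init__(self, experts, beta=0.5, theta=0.01):
        self.experts = list(experts)    # callables: x -> predicted label
        self.weights = [1.0] * len(self.experts)
        self.beta = beta                # multiplicative penalty for a wrong expert
        self.theta = theta              # prune experts whose weight falls below this

    def predict(self, x):
        votes = {}
        for weight, expert in zip(self.weights, self.experts):
            label = expert(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def update(self, x, y_true):
        # Penalise experts that predicted wrongly, then normalise and prune.
        for i, expert in enumerate(self.experts):
            if expert(x) != y_true:
                self.weights[i] *= self.beta
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]
        keep = [i for i, w in enumerate(self.weights) if w >= self.theta]
        self.experts = [self.experts[i] for i in keep]
        self.weights = [self.weights[i] for i in keep]
```

DWM additionally creates a new expert whenever the ensemble as a whole errs; HDWM, per the abstract, further seeds experts of heterogeneous types to preserve diversity.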
PSDSL is empirically evaluated against COMPOSE, LEVELIW, SCARGC and MClassification
on benchmark NSE datasets, as well as on Massive Online Analysis (MOA) data streams and real-world datasets. The results showed that PSDSL performed significantly better than
existing approaches on most real-time data streams, including randomised data instances.
PSDSL also performed significantly better than “Static”, i.e. a classifier that is not updated
after being trained on the first examples in the data stream. When applied to MOA-generated data
streams, PSDSL ranked highest (average rank 1.5) and thus performed significantly better than SCARGC,
while SCARGC performed the same as the Static baseline. PSDSL achieved better average prediction
accuracies than SCARGC in a shorter time.
The HDWM algorithm is evaluated on artificial and real-world data streams against existing
well-known approaches such as the Weighted Majority Algorithm (WMA) and the homogeneous
Dynamic Weighted Majority (DWM) algorithm. The results showed that HDWM performed significantly better than WMA
and DWM. Also, when recurring concept drifts were present, the predictive performance of
HDWM showed an improvement over DWM. In both drift and real-world streams,
significance tests and post hoc comparisons found significant differences between
algorithms: HDWM performed significantly better than DWM and WMA when applied to
MOA data streams and four real-world datasets (Electric, Spam, Sensor and Forest Cover). The
seeding mechanism and the dynamic inclusion of new base learners in HDWM benefit
from both forgetting and retaining models. The algorithm can also select the
best-suited base classifier for its ensemble independently, depending on the problem.
A new approach, Envelope-Clustering, is introduced to resolve cluster overlap conflicts
during the cluster labelling process. In this process, PSDSL transforms the centroid
information of micro-clusters into micro-instances and generates new clusters called
Envelopes. The nearest envelope clusters assist the conflicted micro-clusters and
successfully guide the cluster labelling process after concept drifts in the absence of true
class labels. PSDSL has been evaluated on a real-world problem, keystroke dynamics; the
results show that PSDSL achieved higher prediction accuracy (85.3%) than SCARGC
(81.6%), while the Static baseline (49.0%) degraded significantly due to changes in
the users’ typing patterns. Furthermore, the predictive accuracies of SCARGC were found
to fluctuate widely (between 41.1% and 81.6%) depending on the value of the parameter “k”
(the number of clusters), while PSDSL automatically determines the best value for this
parameter.
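The labelling step described above, assigning classes to unlabelled micro-clusters from nearby labelled structure, can be illustrated with a stripped-down nearest-centroid label propagation; the envelope construction itself is specific to PSDSL and is not reproduced here:

```python
def propagate_labels(labelled_centroids, unlabelled_centroids):
    """labelled_centroids: list of (centroid, label) pairs.
    Assigns each unlabelled centroid the label of its nearest labelled one."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [
        min(labelled_centroids, key=lambda cl: sq_dist(cl[0], c))[1]
        for c in unlabelled_centroids
    ]
```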
UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.
Multidisciplinary perspectives on Artificial Intelligence and the law
This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (“AI”) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.
Evolutionary ecology of obligate fungal and microsporidian invertebrate pathogens
The interactions between hosts and their parasites and pathogens are omnipresent in the natural world. These symbioses are not only key players in ecosystem functioning but also drive genetic diversity through co-evolutionary adaptations. Within the speciose invertebrates, a plethora of interactions with obligate fungal and microsporidian pathogens exist; however, the known interactions are likely only a fraction of the true diversity. Obligate invertebrate fungal and microsporidian pathogens require a host to continue their life cycle; some have specialised in certain host species and require host death to transmit to new hosts. Because of this requirement to kill a host in order to spread to a new one, obligate fungal and microsporidian pathogens regulate invertebrate host populations. Pathogen specialisation to a single host, or very few hosts, has led some fungi to evolve the ability to manipulate their host’s behaviour to maximise transmission. The entomopathogenic fungus Entomophthora muscae infects houseflies (Musca domestica) over a week-long proliferation cycle, resulting in flies climbing to elevated positions, gluing their mouthparts to the substrate surface, and raising their wings to allow a clear exit for fungal conidia through the host abdomen. These sequential behaviours are all timed to occur within a few hours of sunset. The mechanisms E. muscae uses to control the mind of the fly remain relatively unknown, and whether other fitness costs ensue from an infection is understudied.
Sound Event Detection by Exploring Audio Sequence Modelling
Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life, including security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing a sound recognition system are which portion of a sound event the system should analyse, and what proportion of a sound event the system should process in order to claim a confident detection of that particular sound event. While the classification of sound events has improved considerably in recent years, the temporal segmentation of sound events has not improved to the same extent. The aim of this thesis is to propose and develop methods to improve the segmentation and classification of everyday sound events in SED models. In particular, this thesis explores the segmentation of sound events by investigating audio sequence encoding-based and audio sequence modelling-based methods, in an effort to improve the overall sound event detection performance. In the first phase of this thesis, efforts are put towards improving sound event detection by explicitly conditioning the audio sequence representations of an SED model using sound activity detection (SAD) and onset detection.
To achieve this, we propose multi-task learning-based SED models in which SAD and onset detection are used as auxiliary tasks for the SED task. The next part of this thesis explores self-attention-based audio sequence modelling, which aggregates audio representations based on temporal relations within and between sound events, scored on the basis of the similarity of sound event portions in audio event sequences. We propose SED models that include memory-controlled, adaptive, dynamic, and source separation-induced self-attention variants, with the aim of improving overall sound recognition performance.
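The self-attention aggregation described above scores pairs of time frames by similarity and reweights each frame's representation accordingly. A minimal single-head sketch over a sequence of frame embeddings (the projection shapes and naming are illustrative; the thesis variants add memory control, adaptivity, and source-separation conditioning on top of this core):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (T, d) frame features; w_q/w_k/w_v: (d, d_k) projection matrices.
    Returns (T, d_k) frame representations aggregated over time."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])         # (T, T) frame-pair similarity
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over the time axis
    return weights @ v
```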