
    Opportunities and risks of stochastic deep learning

    This thesis studies opportunities and risks associated with stochasticity in deep learning, as they manifest in the contexts of adversarial robustness and neural architecture search (NAS). On the one hand, opportunities arise because stochastic methods have a strong impact on robustness and generalisation, from both a theoretical and an empirical standpoint. In addition, they provide a framework for navigating non-differentiable search spaces and for expressing data and model uncertainty. On the other hand, the trade-offs (i.e., risks) coupled with these benefits need to be carefully considered. The three novel contributions that comprise the main body of this thesis are, by these standards, instances of both opportunities and risks. In the context of adversarial robustness, our first contribution proves that the impact of an adversarial input perturbation on the output of a stochastic neural network (SNN) is theoretically bounded. Specifically, we demonstrate that SNNs are maximally robust when they achieve weight-covariance alignment, i.e., when the weight vectors of their classifier layer are aligned with the eigenvectors of that layer's covariance matrix. Based on these theoretical insights, we develop a novel SNN architecture with excellent empirical adversarial robustness and show that our theoretical guarantees also hold experimentally. In our second contribution, we discover that SNNs partially owe their robustness to having a noisy loss landscape: gradient-based adversaries find this landscape difficult to ascend during adversarial perturbation search, and therefore fail to create strong adversarial examples. We show that inducing a noisy loss landscape is not an effective defence mechanism, as it is easy to circumvent. To demonstrate this point, we develop a stochastic loss-smoothing extension to state-of-the-art gradient-based adversaries that allows them to attack successfully.
Interestingly, our loss-smoothing extension can also (i) succeed against non-stochastic neural networks that defend by altering their loss landscape in other ways, and (ii) strengthen gradient-free adversaries. Our third and final contribution lies in the field of few-shot learning, where we develop a stochastic NAS method for adapting pre-trained neural networks to previously unseen classes by observing only a few training examples of each new class. We determine that adapting a pre-trained backbone is not as simple as adapting all of its parameters: fine-tuning the entire architecture is sub-optimal, as many layers already encode knowledge optimally. Our NAS algorithm instead searches for the optimal subset of pre-trained parameters to adapt or fine-tune, which yields a significant improvement over the existing paradigm for few-shot adaptation.
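
    The loss-smoothing idea described in this abstract can be illustrated with a toy example. The snippet below is a minimal sketch, not the thesis's actual attack: it assumes a one-dimensional input, a hypothetical `noisy_loss` standing in for an SNN's stochastic loss surface, and it estimates the gradient of the expected loss by averaging finite-difference estimates over many stochastic forward passes before each ascent step.

```python
import math
import random

random.seed(0)

def noisy_loss(x, eps):
    # Hypothetical stand-in for an SNN's stochastic loss surface:
    # a smooth bowl plus an input-dependent noise term drawn per forward pass.
    return (x - 3.0) ** 2 + 0.5 * eps * math.sin(50.0 * x)

def smoothed_grad(x, n_samples=100, h=1e-3):
    # Average finite-difference gradient estimates over many stochastic
    # forward passes, flattening the noise the attacker must climb over.
    total = 0.0
    for _ in range(n_samples):
        eps = random.gauss(0.0, 1.0)  # shared noise draw for both evaluations
        total += (noisy_loss(x + h, eps) - noisy_loss(x - h, eps)) / (2.0 * h)
    return total / n_samples

# PGD-style attack: ascend the smoothed loss within the budget [-1, 1].
x = 0.0
for _ in range(200):
    x = max(-1.0, min(1.0, x + 0.05 * smoothed_grad(x)))
# The smoothed gradient reliably points downhill in x, so the
# perturbation is pushed toward the budget boundary near x = -1.
```

    A single noisy gradient sample here would often point the wrong way; averaging is what lets the ascent make steady progress, which is the intuition behind why noise-induced landscapes are easy to circumvent.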

    Analysing behavioural factors that impact financial stock returns: the case of the COVID-19 pandemic in the financial markets

    This thesis advances the field of behavioural finance by integrating classical and state-of-the-art models. It examines the performance and applicability of the Irrational Fractional Brownian Motion (IFBM) model, and it also investigates the propagation of investor sentiment, emphasizing the role of hands-on experience in understanding, applying, and refining complex financial models. Financial markets, characterized by ‘fat tails’ in price-change distributions, often challenge traditional models such as Geometric Brownian Motion (GBM). Addressing this, the research turns to the IFBM, a model initially proposed by Dhesi and Ausloos (2016) and further enriched by Dhesi et al. (2019). This model, tailored to encapsulate the ‘fat tail’ behaviour of asset returns, is the focus of the first chapter of this thesis. Under the guidance of Gurjeet Dhesi, a co-author of the IFBM model, we examined its intricacies and practical applications. The first chapter evaluates the IFBM’s performance in real-world scenarios and enhances its methodological robustness. To achieve this, a tailored algorithm was developed for rigorous testing, alongside a modified Chi-square test for stability assessment. Furthermore, Shannon’s entropy, from an information-theoretic perspective, offers a more nuanced understanding of the model. S&P 500 data serves as an empirical test bed, reflecting real-world financial market dynamics. Upon confirming the model’s robustness, the IFBM is applied to FTSE data during the tumultuous COVID-19 phase. This period, marked by extraordinary market oscillations, serves as an ideal backdrop for assessing the IFBM’s capability to track extreme market shifts.
The second chapter shifts focus to investor sentiment, seen as one of the many factors contributing to the presence of fat tails in return distributions. Building on insights from Baker and Wurgler (2007), we examine the potential impact on market sentiment of political speeches and daily briefings from 10 Downing Street during the COVID-19 crisis. Recognizing the profound market impact of such communications, the chapter seeks correlations between these briefings and market fluctuations. Employing advanced Natural Language Processing (NLP) techniques, this chapter uses the Bidirectional Encoder Representations from Transformers (BERT) model (Devlin et al., 2018) to extract sentiment from governmental communications. By comparing the derived sentiment scores with the performance of stock market indices, potential relationships between public communications and market trajectories are examined. This approach melds traditional finance theory with state-of-the-art machine learning techniques, offering a fresh lens through which market behaviour can be understood in the context of external communications. In conclusion, this thesis provides a detailed examination of the IFBM model’s performance and of the influence of investor sentiment, especially under crisis conditions. This exploration not only advances the discourse in behavioural finance but also underscores the role of sophisticated models in understanding and predicting market trajectories.
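
    The contrast between GBM and fat-tailed empirical returns can be made concrete with a short simulation. The sketch below is purely illustrative and does not reproduce the IFBM itself: it draws log-returns under plain GBM (where they are i.i.d. Gaussian) and checks that their excess kurtosis is near zero, the thin-tailed behaviour the IFBM is designed to move beyond. The drift and volatility values are arbitrary.

```python
import random

random.seed(42)

def gbm_log_returns(n, mu=0.0005, sigma=0.01):
    # Under Geometric Brownian Motion, per-step log-returns are
    # i.i.d. normal with mean (mu - sigma^2 / 2) and std sigma.
    drift = mu - 0.5 * sigma ** 2
    return [drift + sigma * random.gauss(0.0, 1.0) for _ in range(n)]

def excess_kurtosis(xs):
    # Sample excess kurtosis: ~0 for a Gaussian, large and positive
    # for the fat-tailed distributions seen in real price changes.
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / var ** 2 - 3.0

rets = gbm_log_returns(100_000)
kurt = excess_kurtosis(rets)
# GBM keeps excess kurtosis near 0; empirical daily equity returns
# typically show values well above 1, which motivates fat-tail models.
```
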

    On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse

    The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret, and when humans are involved, algorithmic and automated decisions can negatively impact people’s lives. End users would therefore like to be safeguarded against potential harm. One popular way to achieve this is to provide end users with algorithmic recourse: giving those negatively affected by algorithmic decisions the opportunity to reverse unfavorable outcomes, e.g., to turn a loan denial into a loan acceptance. In this thesis, we design recourse algorithms to meet various end-user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses that are likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes to the recourses prescribed to end users are likely to invalidate them once they are noisily implemented in practice. Motivated by this observation, we design methods for generating robust recourses and for assessing the robustness of recourse algorithms to data-deletion requests. Third, the lack of a commonly used code base for counterfactual-explanation and algorithmic-recourse methods, together with the vast array of evaluation measures in the literature, makes it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments.
In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and it suggests new solutions for generating realistic and robust counterfactual explanations for algorithmic recourse.
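
    The latent-space idea described in this abstract can be sketched in a few lines. The example below is a deliberately tiny stand-in, not the thesis's method: `decode` plays the role of a pre-trained generative model whose image is the data manifold (here, a line in the plane), `classifier_score` is a hypothetical linear loan classifier, and recourse is found by numerical gradient ascent on the score with respect to the latent code, so every candidate counterfactual stays on the manifold.

```python
def decode(z):
    # Hypothetical "generative model": maps a 1-D latent code to a
    # 2-D input lying on the data manifold (a straight line here).
    return (z, 2.0 * z + 1.0)

def classifier_score(x):
    # Toy linear classifier; score >= 0 means the favourable outcome
    # (e.g. loan approved).
    return 0.8 * x[0] + 0.5 * x[1] - 3.0

def latent_recourse(z0, lr=0.05, steps=500, h=1e-4):
    # Ascend the classifier score in latent space rather than input
    # space, so intermediate counterfactuals keep data support.
    z = z0
    for _ in range(steps):
        grad = (classifier_score(decode(z + h))
                - classifier_score(decode(z - h))) / (2.0 * h)
        z += lr * grad
        if classifier_score(decode(z)) >= 0.0:
            break
    return decode(z)

x_cf = latent_recourse(0.0)
# x_cf crosses the decision boundary while remaining on the
# line x2 = 2 * x1 + 1 traced out by the decoder.
```

    Searching directly in input space could instead produce a counterfactual off the line, i.e. a recourse with no data support, which is the failure mode the latent-space formulation avoids.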

    Multi-epoch machine learning for galaxy formation

    In this thesis I utilise a range of machine learning techniques in conjunction with hydrodynamical cosmological simulations. In Chapter 2 I present a novel machine learning method for predicting the baryonic properties of dark-matter-only subhalos taken from N-body simulations. The model is built using a tree-based algorithm and incorporates subhalo properties over a wide range of redshifts as its input features. I train the model using a hydrodynamical simulation, which enables it to predict black hole mass, gas mass, magnitudes, star formation rate, stellar mass, and metallicity. This new model surpasses the performance of previous models. Furthermore, I explore the predictive power of each input property by examining feature importance scores from the tree-based model. By applying the method to the LEGACY N-body simulation I generate a large-volume mock catalog of the quasar population at z=3. By comparing this mock catalog with observations, I demonstrate that the IllustrisTNG subgrid model for black holes does not accurately capture the growth of the most massive objects. In Chapter 3 I apply my method to investigate the evolution of galaxy properties in different simulations, and in various environments within a single simulation. By comparing the Illustris, EAGLE, and TNG simulations I show that subgrid model physics plays a more significant role than the choice of hydrodynamics method. Using the CAMELS simulation suite I consider the impact of cosmological and astrophysical parameters on the buildup of stellar mass within the TNG and SIMBA models. In the final chapter I apply a combination of neural networks and symbolic regression to construct a semi-analytic model which reproduces the galaxy population of a cosmological simulation. The neural-network-based approach produces a more accurate population than a previous method based on binning by halo mass.
The equations resulting from symbolic regression are found to be a good approximation of the neural network.
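
    The feature-importance scores mentioned in Chapter 2 rest on how tree-based models choose splits. The sketch below is a toy illustration, not the thesis's model: it scores each candidate (feature, threshold) split of a regression target by its reduction in summed squared error, the same quantity that tree ensembles accumulate per feature to produce importance scores. The subhalo features and stellar-mass targets are invented numbers.

```python
def sse(vals):
    # Summed squared error of values about their mean.
    if not vals:
        return 0.0
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals)

def best_split(X, y):
    # Return the (feature index, threshold, gain) whose split most
    # reduces squared error; gains like this, accumulated per feature
    # across a whole ensemble, yield feature-importance scores.
    best_j, best_t, best_gain = None, None, 0.0
    base = sse(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            gain = base - sse(left) - sse(right)
            if gain > best_gain:
                best_j, best_t, best_gain = j, t, gain
    return best_j, best_t, best_gain

# Invented toy data: feature 0 ("halo mass") drives the target
# ("stellar mass"); feature 1 ("spin") is nearly uninformative.
X = [[1.0, 0.3], [2.0, 0.1], [3.0, 0.4], [4.0, 0.2]]
y = [0.1, 0.2, 0.8, 0.9]
j, t, gain = best_split(X, y)
# The chosen split is on feature 0, mirroring a high importance score.
```
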

    Human gut microbes’ transmission, persistence, and contribution to lactose tolerance

    Human genotypes and their environment interact to produce selectable phenotypes. How microbes of the human gut microbiome interact with their host genotype to shape phenotype is not fully understood. Microbiota that inhabit the human body are environmentally acquired, yet many are passed intergenerationally between related family members, raising the possibility that they could act like genes. Here, I present three studies aimed at better understanding how certain gut microbiota contribute to host phenotypes. In the first study, I assessed mother-to-child transmission in understudied populations. I collected stool samples from 386 mother-infant pairs in Gabon and Vietnam, which are relatively understudied for microbiome dynamics, and in Germany. Using metagenomic sequencing I characterized microbial strain diversity. I found that 25-50% of strains detected in mother-infant pairs were shared, and that strain-sharing between unrelated individuals was rare overall. These observations indicate that vertical transmission of microbes is widespread in human populations. Second, to test whether strains acquired during infancy persist into adulthood (as human genes do), I collected stool from an adolescent previously surveyed for microbiome diversity as an infant. This dataset represents the longest follow-up to date on the persistence of strains seeded in infancy. I observed two strains that had persisted in the gut for over 10 years, as well as five additional strains shared between the subject and his parents. Taken together, the results of these first two studies suggest that gut microbial strains persist throughout life and transmit between host generations, dynamics more similar to those of the host’s own genome than of their environment. Third, I tested whether gut microbes could confer a phenotype (lactose tolerance) on individuals lacking the necessary genotype (lactase persistence).
I studied 784 women in Gabon, Vietnam and Germany for lactase persistence (genotype) and lactose tolerance (phenotype), and characterized their gut microbiomes through metagenomic sequencing. Despite lacking the lactase-persistence genotype, 13% of participants were lactose tolerant by clinical criteria; I termed this novel phenotype microbially-acquired lactose tolerance (MALT). Those with MALT harbored microbiomes enriched for Bifidobacteria, known lactose degraders. These results indicate that Bifidobacteria, which are passed intergenerationally, can confer a phenotype previously thought to be under only host genetic control. Taken together, my thesis work lends weight to the concept that specific microbes inhabiting the human gut have the potential to behave as epigenetic factors in evolution.

    Full-scale modal testing of a Hawk T1A aircraft for benchmarking vibration-based methods

    Research developments for structural dynamics in the fields of design, system identification and structural health monitoring (SHM) have dramatically expanded the bounds of what can be learned from measured vibration data. However, significant challenges remain in the tasks of identification, prediction and evaluation for full-scale structures. A significant aid on the road to applying cutting-edge methods to in-service engineering structures is the development of comprehensive benchmark datasets. With the aim of developing a useful and worthwhile benchmark dataset for structural dynamics, an extensive testing campaign is presented here. This campaign was performed on a decommissioned BAE Systems Hawk T1A aircraft at the Laboratory for Verification and Validation (LVV) in Sheffield. The aim of this paper is to present the dataset, providing details on the structure, experimental design, and data acquired. The collected data are made freely and openly available with the intention that they serve as a benchmark dataset for challenges in full-scale structural dynamics. Here, the details pertaining to two test phases (frequency and time domain) are presented. To ensure that the presented dataset can function as a benchmark, baseline results are additionally presented for the tasks of identification and prediction, using standard approaches. It is envisaged that advanced methodologies will demonstrate their superiority by favourable comparison with the results presented here. Finally, some dataset-specific challenges are described, with a view to forming a hierarchy of tasks and framing discussion of their relative difficulty.

    UMSL Bulletin 2023-2024

    The 2023-2024 Bulletin and Course Catalog for the University of Missouri-St. Louis.

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics, and although AI was initially allowed to develop largely without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book therefore brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Electrophysiological hallmarks for event relations and event roles in working memory

    The ability to maintain events (i.e., interactions between or among objects) in working memory is crucial for everyday cognition, yet the format of this representation is poorly understood. The current ERP study was designed to answer two questions. First, how is maintaining events (e.g., the tiger hit the lion) neurally different from maintaining item coordinations (e.g., the tiger and the lion)? That is, how is the event relation (present in events but not in coordinations) represented? Second, how is the agent, or initiator of the event, encoded differently from the patient, or receiver of the event, during maintenance? We used a novel picture-sentence match-across-delay approach in which the working memory representation was “pinged” during the delay, replicated across two ERP experiments with Chinese and English materials. We found that maintenance of events elicited a long-lasting late sustained difference at posterior-occipital electrodes relative to non-events. This effect resembled the negative slow wave reported in previous studies of working memory, suggesting that the maintenance of events in working memory may impose a higher cost than that of coordinations. Although we did not observe significant ERP differences associated with pinging the agent vs. the patient during the delay, we did find that the ping appeared to dampen the ongoing sustained difference, suggesting a shift from sustained activity to activity-silent mechanisms. These results suggest a new method by which ERPs can be used to elucidate the format of the neural representation of events in working memory.