
    UMSL Bulletin 2023-2024

    Get PDF
    The 2023-2024 Bulletin and Course Catalog for the University of Missouri–St. Louis.

    Determinantal Beam Search

    Full text link
    Beam search is a go-to strategy for decoding neural sequence models. The algorithm can naturally be viewed as a subset optimization problem, albeit one where the corresponding set function does not reflect interactions between candidates. Empirically, this leads to sets often exhibiting high overlap, e.g., strings may differ by only a single word. Yet in use-cases that call for multiple solutions, a diverse or representative set is often desired. To address this issue, we propose a reformulation of beam search, which we call determinantal beam search. Determinantal beam search has a natural relationship to determinantal point processes (DPPs), models over sets that inherently encode intra-set interactions. By posing iterations in beam search as a series of subdeterminant maximization problems, we can turn the algorithm into a diverse subset selection process. In a case study, we use the string subsequence kernel to explicitly encourage n-gram coverage in text generated from a sequence model. We observe that our algorithm offers competitive performance against other diverse set generation strategies in the context of language generation, while providing a more general approach to optimizing for diversity.
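
    As a rough illustration of the subdeterminant maximization at the heart of this method, here is a minimal NumPy sketch of greedy diverse subset selection over a fixed candidate pool. It is a sketch under simplifying assumptions: the actual algorithm interleaves this selection with beam expansion at every decoding step and uses a string subsequence kernel, whereas the toy RBF kernel over random features below merely stands in for candidate similarity.

```python
import numpy as np

def greedy_subdeterminant_selection(log_probs, K, k):
    """Greedily pick k of n candidates, maximizing log det of the kernel
    submatrix L = diag(q) @ K @ diag(q), where q_i = exp(log_probs[i])
    encodes candidate quality and K encodes pairwise similarity."""
    q = np.exp(log_probs)
    L = np.outer(q, q) * K  # quality-weighted similarity kernel
    selected, remaining = [], list(range(len(q)))
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, val = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and val > best_val:
                best, best_val = i, val
        if best is None:  # no candidate keeps the subdeterminant positive
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: five candidates with made-up probabilities and an RBF kernel.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))
K = np.exp(-0.5 * np.sum((feats[:, None] - feats[None, :]) ** 2, axis=-1))
print(greedy_subdeterminant_selection(np.log([0.4, 0.3, 0.15, 0.1, 0.05]), K, 3))
```

    Because similar candidates shrink the determinant, each greedy step trades off candidate quality on the diagonal against redundancy in the off-diagonal entries.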

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    Get PDF
    More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many possible ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment of the model. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation, and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org.
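
    To make the described workflow concrete, here is a minimal scikit-learn sketch of the steps it automates: configuring how annotations become model features, hyperparameter optimization with cross-validation, and genome-wide scoring as deployment. The annotation columns, file names, and choice of logistic regression are illustrative assumptions, not the actual CADD configuration.

```python
# Hypothetical training workflow for a variant effect model.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# One row per annotated variant; label 1 = proxy-pathogenic, 0 = proxy-benign.
variants = pd.read_csv("annotated_variants.tsv", sep="\t")
numeric = ["conservation_score", "splice_score", "allele_frequency"]
categorical = ["consequence"]

# Configurable mapping from annotations to model features.
features = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([("features", features),
                  ("clf", LogisticRegression(max_iter=1000))])

# Hyperparameter optimization and model selection via cross-validation.
search = GridSearchCV(model, {"clf__C": [0.01, 0.1, 1.0, 10.0]},
                      scoring="roc_auc", cv=5)
search.fit(variants[numeric + categorical], variants["label"])

# "Deployment": score a genome-wide table of variants with the chosen model.
genome_wide = pd.read_csv("all_variants.tsv", sep="\t")
genome_wide["score"] = search.predict_proba(
    genome_wide[numeric + categorical])[:, 1]
```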

    PARAMETRIC APPROACHES TO BALANCE STORMWATER MANAGEMENT AND HUMAN WELLBEING WITHIN URBAN GREEN SPACE

    Full text link
    With rapid urbanisation, urban green spaces (UGS) have become increasingly limited and valuable in high-density urban environments. However, meeting the diverse requirements of sustainable urban development often leads to conflicts in UGS usage. For example, the presence of stormwater treatment facilities may hinder residents' access to adjacent UGS. Traditional approaches to UGS design typically evaluate human wellbeing and stormwater management separately. However, the questionnaires, interviews, and surveys used for human wellbeing evaluation are difficult to generalise across different projects and cities, while the professional hydrological models used for stormwater management require extensive knowledge of hydrology, and their 2D evaluation methods are hard to integrate with 3D models. To address these challenges, this thesis proposes a novel framework that integrates the two types of analysis within one system for balancing the needs of human wellbeing and stormwater management in UGS design. The framework incorporates criteria and parameters for evaluating human wellbeing and stormwater management in a 3D model and introduces an approach to compare these two needs in terms of UGS area and suitable location. The contributions of this thesis to multi-objective UGS design are as follows: (1) defining human wellbeing evaluation through Accessibility and Usability assessment, which considers factors such as connectivity, walking distance, space enclosure, and space availability; (2) simplifying stormwater evaluation using particle systems and design curves to streamline complex hydrological models; (3) integrating the two evaluations by comparing their quantified requirements for UGS area and location; and (4) incorporating parameters to provide flexibility and accommodate various design scenarios and objectives. The advantages of this evaluation framework are demonstrated through two case studies: (1) the human wellbeing analysis based on spatial parameters shows sensitivity to site variations, including UGS quantity and distribution, population density, terrain, road context, and height of void space; (2) the simplified stormwater analysis effectively captures site variations in UGS quantity and distribution, building distribution, and terrain, providing recommendations for each UGS with different types and sizes of stormwater facilities; (3) because the framework is built on spatial parameters, relevant thresholds can be adjusted and additional parameters included to respond to specific project needs; and (4) by quantifying and comparing the two different requirements for UGS, any UGS with high usage conflicts can be easily identified. By evaluating all proposed criteria for UGSs in the 3D model, designers can conveniently observe simulations and adjust design scenarios to address identified usage conflicts. Thus, the proposed evaluation framework would be valuable in effectively supporting multi-objective UGS design.
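
    As a loose illustration of the framework's core comparison step, the sketch below quantifies, for each UGS, the area demanded by stormwater treatment (via a stand-in linear design curve) and by wellbeing uses, then flags sites where the combined demand exceeds the available area. All names, parameter values, and the design curve itself are hypothetical, not values from the thesis.

```python
from dataclasses import dataclass

@dataclass
class UGS:
    name: str
    total_area_m2: float        # usable green space area
    catchment_area_m2: float    # impervious area draining to this UGS
    wellbeing_demand_m2: float  # area required by accessibility/usability criteria

def stormwater_area_required(catchment_area_m2: float,
                             design_storm_depth_m: float = 0.03,
                             facility_storage_depth_m: float = 0.3) -> float:
    """Stand-in design curve: treatment area grows linearly with the runoff
    volume of a design storm divided by the facility's storage depth."""
    runoff_volume_m3 = catchment_area_m2 * design_storm_depth_m
    return runoff_volume_m3 / facility_storage_depth_m

for site in [UGS("Park A", 5000, 20000, 3500),
             UGS("Pocket green B", 800, 9000, 600)]:
    sw = stormwater_area_required(site.catchment_area_m2)
    conflict = sw + site.wellbeing_demand_m2 > site.total_area_m2
    print(f"{site.name}: stormwater {sw:.0f} m2, wellbeing "
          f"{site.wellbeing_demand_m2:.0f} m2, conflict={conflict}")
```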

    2023-2024 Boise State University Undergraduate Catalog

    Get PDF
    This catalog is primarily directed at students. However, it serves many audiences, such as high school counselors, academic advisors, and the public. In this catalog you will find an overview of Boise State University and information on admission, registration, grades, tuition and fees, financial aid, housing, student services, and other important policies and procedures. Most of the catalog, however, is devoted to describing the various programs and courses offered at Boise State.

    Contested environmental futures: rankings, forecasts and indicators as sociotechnical endeavours

    Get PDF
    In a world where numbers and science are often taken as the voice of truth and reason, Quantitative Devices (QDs) represent the epitome of policy driven by facts rather than hunches. Despite the scholarly interest in understanding the role of quantification in policy, the actual production of rankings, forecasts, indexes and other QDs has, to a great extent, been left unattended. While appendixes and technical notebooks offer an explanation of how these devices are produced, they exclude aspects of their making that are arbitrarily considered "mundane". It is in the everyday performances at research centres that the micropolitics of knowledge production, imaginaries, and frustrations merge. These are vital dimensions for understanding the potential, limitations and ethical consequences of QDs. Using two participant observations as the starting point, this thesis offers a comprehensive critical analysis of the processes through which university-based research centres create QDs that represent the world. It addresses how researchers conceive quantitative data. It pays attention to the discourses of hope and expectation embedded in the devices. Finally, it considers the ethics of creating devices that cannot be replicated independently of their place of production. Two QDs were analysed: the Violence Early Warning System (ViEWS) and the Environmental Performance Index (EPI). At Uppsala University, researchers created ViEWS to forecast the probability of drought-driven conflicts within the next 100 years. The EPI, produced at the Yale Centre for Environmental Law and Policy, ranks the performance of countries' environmental policies. This thesis challenges existing claims within Science and Technology Studies and the Sociology of Quantification that QDs co-produce knowledge within their realms. I argue that these devices act as vehicles for sociotechnical infrastructures to be consolidated with little debate among policymakers, given their understanding as scientific and objective tools. Moreover, for an indicator to be incorporated within a QD, it needs to be deemed relevant by those making the devices but also valuable enough to have been previously quantified by data providers. Furthermore, existing sociotechnical inequalities, power relations and epistemic injustices could impede disadvantaged communities' (e.g., in the Global South) ability to challenge metrics originating in centres in the Global North. This thesis, therefore, demonstrates how the future QDs propose is unilateral and does not acknowledge the myriad possibilities that might arise from a diversity of worldviews. In other words, they cast a future designed to fit under the current status quo. In sum, through two environment-related QDs, this thesis launches an inquiry into the elements that make up the imaginaries they propose by following the everyday life of their producers. To achieve this, I discuss two core elements: first, the role of tacit knowledge and sociotechnical inequalities in reinforcing power relations between those with the means to quantify and those who might only accommodate proposed futures; second, the dynamics between research centres and data providers in relation to what is quantified. By scrutinising mundanity, this work is a step forward in understanding the construction of sociotechnical imaginaries and infrastructures.

    ManyDG: Many-domain Generalization for Healthcare Applications

    Full text link
    Vast amounts of health data are continuously collected for each patient, providing opportunities to support diverse healthcare predictive tasks such as seizure detection and hospitalization prediction. Existing models are mostly trained on data from other patients and evaluated on new patients, and many suffer from poor generalizability. One key reason can be overfitting to the unique information related to patient identities and their data collection environments, referred to as patient covariates in this paper. These patient covariates usually do not contribute to predicting the targets but are often difficult to remove. As a result, they can bias the model training process and impede generalization. In healthcare applications, most existing domain generalization methods assume a small number of domains. In this paper, considering the diversity of patient covariates, we propose a new setting that treats each patient as a separate domain (leading to many domains). We develop a new domain generalization method, ManyDG, that can scale to such many-domain problems. Our method identifies the patient domain covariates by mutual reconstruction and removes them via an orthogonal projection step. Extensive experiments show that ManyDG can boost the generalization performance on multiple real-world healthcare tasks (e.g., 3.7% Jaccard improvement on MIMIC drug recommendation) and support realistic but challenging settings such as insufficient data and continuous learning.
    Comment: The paper has been accepted by ICLR 2023; refer to https://openreview.net/forum?id=lcSfirnflpW. We will release the data and source code here: https://github.com/ycq091044/ManyD
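
    The orthogonal projection step mentioned in the abstract can be illustrated in a few lines: given a feature embedding and an estimated patient-covariate embedding, subtract the component of the features that lies along the covariate direction. How the covariate vectors are obtained (the paper's mutual reconstruction) is not shown; this is an illustrative fragment, not the authors' implementation.

```python
import torch

def remove_covariate(features: torch.Tensor, covariate: torch.Tensor) -> torch.Tensor:
    """Project each row of `features` onto the orthogonal complement
    of the corresponding row of `covariate`."""
    cov = torch.nn.functional.normalize(covariate, dim=-1)  # unit covariate direction
    coeff = (features * cov).sum(dim=-1, keepdim=True)      # projection coefficient
    return features - coeff * cov                           # orthogonal residual

x = torch.randn(4, 16)   # batch of patient feature embeddings
v = torch.randn(4, 16)   # estimated patient-covariate embeddings
x_clean = remove_covariate(x, v)
# Each cleaned row is now orthogonal to its covariate direction (values ~ 0).
print((x_clean * torch.nn.functional.normalize(v, dim=-1)).sum(-1))
```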

    Monetizing Explainable AI: A Double-edged Sword

    Full text link
    Algorithms used by organizations increasingly wield power in society as they decide the allocation of key resources and basic goods. In order to promote fairer, more just, and more transparent uses of such decision-making power, explainable artificial intelligence (XAI) aims to provide insights into the logic of algorithmic decision-making. Despite much research on the topic, consumer-facing applications of XAI remain rare. A central reason may be that a viable platform-based monetization strategy for this new technology has yet to be found. We introduce and describe a novel monetization strategy for fusing algorithmic explanations with programmatic advertising via an explanation platform. We claim the explanation platform represents a new, socially impactful, and profitable form of human-algorithm interaction and estimate its potential for revenue generation in the high-risk domains of finance, hiring, and education. We then consider possible undesirable and unintended effects of monetizing XAI and simulate these scenarios using real-world credit lending data. Ultimately, we argue that monetizing XAI may be a double-edged sword: while monetization may incentivize industry adoption of XAI in a variety of consumer applications, it may also conflict with the original legal and ethical justifications for developing XAI. We conclude by discussing whether there may be ways to responsibly and democratically harness the potential of monetized XAI to provide greater consumer access to algorithmic explanations.

    Modelling, Monitoring, Control and Optimization for Complex Industrial Processes

    Get PDF
    This reprint includes 22 research papers and an editorial, collected from the Special Issue "Modelling, Monitoring, Control and Optimization for Complex Industrial Processes", highlighting recent research advances and emerging research directions in complex industrial processes. This reprint aims to promote the field and benefit readers from both the academic community and industrial sectors.

    ProGAP: Progressive Graph Neural Networks with Differential Privacy Guarantees

    Full text link
    Graph Neural Networks (GNNs) have become a popular tool for learning on graphs, but their widespread use raises privacy concerns, as graph data can contain personal or sensitive information. Differentially private GNN models have recently been proposed to preserve privacy while still allowing for effective learning over graph-structured datasets. However, achieving an ideal balance between accuracy and privacy in GNNs remains challenging due to the intrinsic structural connectivity of graphs. In this paper, we propose a new differentially private GNN called ProGAP that uses a progressive training scheme to improve this accuracy-privacy trade-off. Combined with the aggregation perturbation technique to ensure differential privacy, ProGAP splits a GNN into a sequence of overlapping submodels that are trained progressively, expanding from the first submodel to the complete model. Specifically, each submodel is trained over the privately aggregated node embeddings learned and cached by the previous submodels, leading to increased expressive power compared to previous approaches while limiting the incurred privacy costs. We formally prove that ProGAP ensures edge-level and node-level privacy guarantees for both training and inference stages, and we evaluate its performance on benchmark graph datasets. Experimental results demonstrate that ProGAP can achieve up to 5-10% higher accuracy than existing state-of-the-art differentially private GNNs.
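
    A minimal sketch of the aggregation perturbation idea, and of how progressive submodels could consume each other's cached outputs, is shown below. Row-normalizing the embeddings bounds each node's contribution to a neighbor's aggregate, so Gaussian noise calibrated to that bound yields a private aggregate. The noise scale, the two-stage structure, and the omission of actual training loops are simplifying assumptions, not the paper's exact recipe.

```python
import torch

def private_aggregate(x: torch.Tensor, adj: torch.Tensor, noise_std: float) -> torch.Tensor:
    x = torch.nn.functional.normalize(x, p=2, dim=1)  # bound per-node sensitivity to 1
    agg = adj @ x                                     # neighborhood aggregation
    return agg + noise_std * torch.randn_like(agg)    # Gaussian mechanism

n, d = 100, 32
x = torch.randn(n, d)                    # initial node features
adj = (torch.rand(n, n) < 0.05).float()  # random adjacency matrix

# Progressive stages: each submodel consumes the privately aggregated,
# cached embeddings of the previous stage (training loops omitted).
mlp1 = torch.nn.Sequential(torch.nn.Linear(d, d), torch.nn.ReLU())
h1 = private_aggregate(mlp1(x), adj, noise_std=0.5).detach()   # cache stage 1
mlp2 = torch.nn.Sequential(torch.nn.Linear(d, d), torch.nn.ReLU())
h2 = private_aggregate(mlp2(h1), adj, noise_std=0.5).detach()  # cache stage 2
```

    Caching the noisy aggregates means the privacy cost is paid once per stage rather than once per training step, which is what lets the progressive scheme limit the total privacy budget.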