34 research outputs found

    Consensus Regularized Selection Based Prediction

    Integrating regularization methods within a regression framework has become a popular choice for researchers to build predictive models with lower variance and better generalization. Regularizers also aid in building interpretable models with high-dimensional data, which makes them very appealing. Regularizers in general are unique in nature, as they cater to data-specific features such as correlation, structured sparsity, and temporal smoothness. The problem of obtaining a consensus among such diverse regularizers is extremely important in order to determine the optimal regularizer for the model. This is called the consensus regularization problem, which has not received much attention in the literature due to the inherent difficulty associated with building an integrated regularization framework. To solve this problem, in this thesis, we propose a method to generate a committee of non-convex regularized linear regression models, and use a consensus criterion to determine the optimal model for prediction. Each corresponding non-convex optimization problem in the committee is solved efficiently using the cyclic-coordinate descent algorithm with the generalized thresholding operator. Our Consensus RegularIzation Selection based Prediction (CRISP) model is evaluated on electronic health records (EHRs) obtained from a large hospital for the chronic heart failure readmission problem. We also evaluate our model on various synthetic datasets to assess its performance. The results indicate that CRISP outperforms several state-of-the-art methods such as additive models and other competing non-convex regularized linear regression methods.
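    As a rough illustration of the solver family named in the abstract above, the sketch below runs cyclic coordinate descent with a thresholding operator for one non-convex penalty. It is not the CRISP implementation: the choice of the minimax concave penalty (MCP), the function names, and the parameters `lam` and `gamma` are all assumptions made for this example, and the columns of `X` are assumed to be scaled to unit mean square.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator (the lasso update)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def mcp_threshold(z, lam, gamma):
    """Thresholding operator for the MCP penalty (assumed here; requires gamma > 1)."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large coefficients are left unshrunk


def coordinate_descent(X, y, lam, gamma=3.0, n_sweeps=200, tol=1e-8):
    """Cyclic coordinate descent for MCP-penalized least squares.

    Assumes every column of X has mean zero and mean square one.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float)  # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            # Univariate least-squares solution for coordinate j.
            z = X[:, j] @ resid / n + beta[j]
            new = mcp_threshold(z, lam, gamma)
            delta = new - beta[j]
            if delta != 0.0:
                resid -= X[:, j] * delta  # keep the residual in sync
                beta[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Hypothetical synthetic data: two active features out of ten.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # unit mean-square columns
true_beta = np.zeros(10)
true_beta[:2] = [2.0, -1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(500)

beta_hat = coordinate_descent(X, y, lam=0.1)
print(np.round(beta_hat, 3))
```

    A committee in the sense of the abstract would repeat such a fit for several penalties and then apply a consensus criterion across the fitted models; that selection step is not specified in the abstract and is not sketched here.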

    Machine learning in healthcare : an investigation into model stability

    Current machine learning algorithms, when directly applied to medical data, often fail to provide a good understanding of prognosis. This study provides three pathways to make predictive models stable and usable for healthcare. When tested on heart failure and diabetes patients from a local hospital, this study demonstrated a 20% improvement over existing methods.

    Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models

    Structured additive regression provides a general framework for complex Gaussian and non-Gaussian regression models, with predictors comprising arbitrary combinations of nonlinear functions and surfaces, spatial effects, varying coefficients, random effects and further regression terms. The large flexibility of structured additive regression makes function selection a challenging and important task, aiming at (1) selecting the relevant covariates, (2) choosing an appropriate and parsimonious representation of the impact of covariates on the predictor and (3) determining the required interactions. We propose a spike-and-slab prior structure for function selection that allows single coefficients, as well as blocks of coefficients representing specific model terms, to be included or excluded. A novel multiplicative parameter expansion is required to obtain good mixing and convergence properties in a Markov chain Monte Carlo simulation approach and is shown to induce desirable shrinkage properties. In simulation studies and with (real) benchmark classification data, we investigate sensitivity to hyperparameter settings and compare performance to competitors. The flexibility and applicability of our approach are demonstrated in an additive piecewise exponential model with time-varying effects for right-censored survival times of intensive care patients with sepsis. Geoadditive and additive mixed logit model applications are discussed in an extensive appendix.

    Building stable predictive models for healthcare applications: a data-driven approach

    Analysing health-related data enables us to transform data into predictive models. These models can be used to make accurate diagnoses or prognoses about states of a patient that cannot be investigated directly. The scope of this thesis lies within the realm of biomedical informatics, an interdisciplinary field at the crossroads of medicine and computer science.

    Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

    Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner. Comment: Accepted for publication in Nature Communications.

    Black Boxes and Theory Deserts: Deep Networks and Epistemic Opacity in the Cognitive Sciences

    Cognitive scientists deal with technology in a very particular way: they use technology to understand perception, action, and cognition. This particular form of human-machine interaction (HMI) is very well illustrated by the use cognitive scientists make of artificial neural networks as models of cognitive systems and, more concretely, of the brain. However, the activity of cognitive scientists in this context suffers from the shortcoming of epistemic opacity: artificial neural networks are too difficult to interpret and understand, so in many cases they remain black boxes for researchers. In this paper, we provide a diagnosis of such epistemic opacity based on dominant cognitive science’s lack of theoretical resources to account for the activity of artificial neural networks when taken as models of the brain. Then, we offer guidelines for a solution founded on the notion of information developed in ecological psychology.

    Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments

    The spread of misinformation, propaganda, and flawed argumentation has been amplified in the Internet era. Given the volume of data and the subtlety of identifying violations of argumentation norms, supporting information analytics tasks, like content moderation, with trustworthy methods that can identify logical fallacies is essential. In this paper, we formalize prior theoretical work on logical fallacies into a comprehensive three-stage evaluation framework of detection, coarse-grained, and fine-grained classification. We adapt existing evaluation datasets for each stage of the evaluation. We employ three families of robust and explainable methods based on prototype reasoning, instance-based reasoning, and knowledge injection. The methods combine language models with background knowledge and explainable mechanisms. Moreover, we address data sparsity with strategies for data augmentation and curriculum learning. Our three-stage framework natively consolidates prior datasets and methods from existing tasks, like propaganda detection, serving as an overarching evaluation testbed. We extensively evaluate these methods on our datasets, focusing on their robustness and explainability. Our results provide insight into the strengths and weaknesses of the methods on different components and fallacy classes, indicating that fallacy identification is a challenging task that may require specialized forms of reasoning to capture various classes. We share our open-source code and data on GitHub to support further work on logical fallacy identification.