Consensus Regularized Selection Based Prediction
Integrating regularization methods within a regression framework has become a popular way for researchers to build predictive models with lower variance and better generalization. Regularizers also help build interpretable models from high-dimensional data, which makes them very appealing. Regularizers differ from one another because each caters to data-specific features such as correlation, structured sparsity, and temporal smoothness. Obtaining a consensus among such diverse regularizers is therefore important for determining the optimal regularizer for a model. This consensus regularization problem has received little attention in the literature because of the inherent difficulty of building an integrated regularization framework. To solve this problem, in this thesis, we propose a method that generates a committee of non-convex regularized linear regression models and uses a consensus criterion to determine the optimal model for prediction. Each non-convex optimization problem in the committee is solved efficiently using the cyclic coordinate descent algorithm with the generalized thresholding operator. Our Consensus RegularIzation Selection based Prediction (CRISP) model is evaluated on electronic health records (EHRs) obtained from a large hospital for the chronic heart failure readmission problem. We also evaluate our model on various synthetic datasets to assess its performance. The results indicate that CRISP outperforms several state-of-the-art methods such as additive models and other competing non-convex regularized linear regression methods.
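As an illustration of the solver named in the abstract above, the following is a minimal sketch of cyclic coordinate descent with a pluggable thresholding operator. It is not the CRISP implementation: the function names, the choice of the minimax concave penalty (MCP) as the non-convex example, the default `gamma`, and the fixed iteration count are all assumptions made for illustration. It also assumes that each column of `X` is standardized so that its squared norm divided by the number of rows equals 1. The committee construction and the consensus criterion are not shown.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator (the lasso update for a standardized column)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def mcp_threshold(z, lam, gamma=3.0):
    """Thresholding operator for the MCP penalty (assumes gamma > 1)."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large coefficients are left unshrunk

def coordinate_descent(X, y, lam, threshold=mcp_threshold, n_iter=100):
    """Cyclic coordinate descent for penalized least squares.

    Assumes every column x_j of X satisfies x_j @ x_j / n == 1.
    `threshold` is any function (z, lam) -> updated coefficient.
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.asarray(y, dtype=float).copy()
    for _ in range(n_iter):
        for j in range(p):
            # Univariate least-squares solution for coordinate j,
            # holding all other coefficients fixed.
            z = X[:, j] @ residual / n + beta[j]
            new_bj = threshold(z, lam)
            # Keep the residual in sync with the updated coefficient.
            residual -= X[:, j] * (new_bj - beta[j])
            beta[j] = new_bj
    return beta
```

Passing `threshold=soft_threshold` gives the convex lasso update instead of the MCP one, which is one way a set of differently regularized models could be fitted with the same loop.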
Machine learning in healthcare : an investigation into model stability
Current machine learning algorithms, when directly applied to medical data, often fail to provide a good understanding of prognosis. This study provides three pathways to make predictive models stable and usable for healthcare. When tested on heart failure and diabetes patients from a local hospital, this study demonstrated a 20% improvement over existing methods.
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models
Structured additive regression provides a general framework for complex
Gaussian and non-Gaussian regression models, with predictors comprising
arbitrary combinations of nonlinear functions and surfaces, spatial effects,
varying coefficients, random effects and further regression terms. The large
flexibility of structured additive regression makes function selection a
challenging and important task, aiming at (1) selecting the relevant
covariates, (2) choosing an appropriate and parsimonious representation of the
impact of covariates on the predictor and (3) determining the required
interactions. We propose a spike-and-slab prior structure for function
selection that allows single coefficients, as well as blocks of coefficients
representing specific model terms, to be included or excluded. A novel
multiplicative parameter expansion is required to obtain good mixing and
convergence properties in a Markov chain Monte Carlo simulation approach and is
shown to induce desirable shrinkage properties. In simulation studies and with
(real) benchmark classification data, we investigate sensitivity to
hyperparameter settings and compare performance to competitors. The flexibility
and applicability of our approach are demonstrated in an additive piecewise
exponential model with time-varying effects for right-censored survival times
of intensive care patients with sepsis. Geoadditive and additive mixed logit
model applications are discussed in an extensive appendix.
Building stable predictive models for healthcare applications: a data-driven approach
Analysing health-related data enables us to transform data into predictive models. These models can be used to make accurate diagnoses or prognoses about patient states that cannot be investigated directly. The scope of this thesis lies within the realm of biomedical informatics, an interdisciplinary field at the crossroads of medicine and computer science.
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Current learning machines have successfully solved hard application problems,
reaching high accuracy and displaying seemingly "intelligent" behavior. Here we
apply recent techniques for explaining decisions of state-of-the-art learning
machines and analyze various tasks from computer vision and arcade games. This
showcases a spectrum of problem-solving behaviors ranging from naive and
short-sighted, to well-informed and strategic. We observe that standard
performance evaluation metrics can be oblivious to distinguishing these diverse
problem solving behaviors. Furthermore, we propose our semi-automated Spectral
Relevance Analysis that provides a practically effective way of characterizing
and validating the behavior of nonlinear learning machines. This helps to
assess whether a learned model indeed delivers reliably for the problem that it
was conceived for. Furthermore, our work intends to add a voice of caution to
the ongoing excitement about machine intelligence and pledges to evaluate and
judge some of these recent successes in a more nuanced manner.
Comment: Accepted for publication in Nature Communications
Black Boxes and Theory Deserts: Deep Networks and Epistemic Opacity in the Cognitive Sciences
Cognitive scientists deal with technology in a very particular way: they use technology to understand perception, action, and cognition. This particular form of human-machine interaction (HMI) is very well illustrated by the use cognitive scientists make of artificial neural networks as models of cognitive systems and, more concretely, of the brain. However, the activity of cognitive scientists in this context suffers from the shortcoming of epistemic opacity: artificial neural networks are too difficult to interpret and understand, so in many cases they remain black boxes for researchers. In this paper, we provide a diagnosis of such epistemic opacity based on dominant cognitive science’s lack of theoretical resources to account for the activity of artificial neural networks when taken as models of the brain. Then, we offer guidelines for a solution founded on the notion of information developed in ecological psychology.
Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
The spread of misinformation, propaganda, and flawed argumentation has been
amplified in the Internet era. Given the volume of data and the subtlety of
identifying violations of argumentation norms, supporting information analytics
tasks, like content moderation, with trustworthy methods that can identify
logical fallacies is essential. In this paper, we formalize prior theoretical
work on logical fallacies into a comprehensive three-stage evaluation framework
of detection, coarse-grained, and fine-grained classification. We adapt
existing evaluation datasets for each stage of the evaluation. We employ three
families of robust and explainable methods based on prototype reasoning,
instance-based reasoning, and knowledge injection. The methods combine language
models with background knowledge and explainable mechanisms. Moreover, we
address data sparsity with strategies for data augmentation and curriculum
learning. Our three-stage framework natively consolidates prior datasets and
methods from existing tasks, like propaganda detection, serving as an
overarching evaluation testbed. We extensively evaluate these methods on our
datasets, focusing on their robustness and explainability. Our results provide
insight into the strengths and weaknesses of the methods on different
components and fallacy classes, indicating that fallacy identification is a
challenging task that may require specialized forms of reasoning to capture
various classes. We share our open-source code and data on GitHub to support
further work on logical fallacy identification.