A reinforcement learning design for HIV clinical trials
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2014.
Determining effective treatment strategies for life-threatening illnesses such as HIV is
a significant problem in clinical research. Currently, HIV treatment involves using
combinations of anti-HIV drugs to inhibit the formation of drug-resistant strains. From
a clinician's perspective, this usually requires careful selection of drugs on the basis of an
individual's immune responses at a particular time. As the number of drugs available for
treatment increases, this task becomes difficult. In a clinical trial setting, the task is even
more challenging since experience using new drugs is limited. For these reasons, this
research examines whether machine learning techniques, and more specifically batch
reinforcement learning, can be used for the purposes of determining the appropriate
treatment for an HIV-infected patient at a particular time. To do so, we consider using
fitted Q-iteration with extremely randomized trees, neural fitted Q-iteration and least
squares policy iteration. The use of batch reinforcement learning means that samples
of patient data are captured prior to learning to avoid imposing risks on a patient.
Because samples are re-used, these methods are data-efficient and particularly suited to
situations where large amounts of data are unavailable. We apply each of these learning
methods to both numerically generated and real data sets. Results from this research
highlight the advantages and disadvantages associated with each learning technique.
Real data testing has revealed that these batch reinforcement learning techniques have
the ability to suggest treatments that are reasonably consistent with those prescribed
by clinicians. The inclusion of additional state variables describing more about an
individual's health could further improve this learning process. Ultimately, the use of
such reinforcement learning methods could be coupled with a clinician's knowledge for
enhanced treatment design.
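The batch fitted Q-iteration loop described above can be sketched in a few lines. The toy one-dimensional MDP, sample sizes, and hyperparameters below are illustrative assumptions, not the dissertation's actual clinical setup; only the algorithmic skeleton (bootstrap targets regressed with extremely randomized trees) follows the method named in the abstract.

```python
# A minimal sketch of fitted Q-iteration with extremely randomized trees.
# The toy MDP (action 1 pushes a 1-D state toward 0, where reward is highest)
# is an assumption made purely for illustration.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
n_samples, n_actions, gamma, n_iters = 500, 2, 0.95, 20

# Toy batch of one-step transitions (s, a, r, s') collected prior to learning.
s = rng.uniform(-1, 1, size=(n_samples, 1))
a = rng.integers(0, n_actions, size=n_samples)
s_next = np.clip(s + np.where(a[:, None] == 1, -0.2 * np.sign(s), 0.1), -1, 1)
r = -np.abs(s_next[:, 0])

X = np.column_stack([s, a])           # regress Q on (state, action) pairs
q_model = None
for _ in range(n_iters):
    if q_model is None:
        targets = r                   # first pass: Q_1 = immediate reward
    else:
        # Bootstrap targets: r + gamma * max_a' Q_k(s', a')
        q_next = np.column_stack([
            q_model.predict(np.column_stack([s_next, np.full(n_samples, act)]))
            for act in range(n_actions)
        ])
        targets = r + gamma * q_next.max(axis=1)
    q_model = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X, targets)

# Greedy policy at a query state: pick the action with the highest Q-value.
state = np.array([[0.8]])
q_vals = [q_model.predict(np.column_stack([state, [act]]))[0]
          for act in range(n_actions)]
print("greedy action at s=0.8:", int(np.argmax(q_vals)))
```

Because the batch is fixed and re-used at every iteration, no new interaction with the patient (environment) is needed, which is the data-efficiency property the abstract highlights.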
Causal inference and interpretable machine learning for personalised medicine
In this thesis, we discuss the importance of causal knowledge in healthcare for tailoring treatments to a patient's needs. We propose three different causal models for reasoning about the effects of medical interventions on patients with HIV and sepsis, based on observational data. Both application areas are challenging as a result of patient heterogeneity and the existence of confounding that influences patient outcomes. Our first contribution is a treatment policy mixture model that combines nonparametric, kernel-based learning with model-based reinforcement learning to reason about a series of treatments and their effects. These methods each have their own strengths: nonparametric methods can accurately predict treatment effects where there are overlapping patient instances or where data is abundant; model-based reinforcement learning generalises better in outlier situations by learning a belief state representation of confounding. The overall policy mixture model learns a partition of the space of heterogeneous patients such that we can personalise treatments accordingly. Our second contribution incorporates knowledge from kernel-based reasoning directly into a reinforcement learning model by learning a combined belief state representation. In doing so, we can use the model to simulate counterfactual scenarios to reason about what would happen to a patient if we intervened in a particular way and how their specific outcomes would change. As a result, we may tailor therapies according to patient-specific scenarios.
Our third contribution is a reformulation of the information bottleneck problem for learning an interpretable, low-dimensional representation of confounding for medical decision-making. The approach uses the relevance of information to perform a sufficient reduction of confounding. Based on this reduction, we learn equivalence classes among groups of patients, such that we may transfer knowledge to patients with incomplete covariate information at test time. By conditioning on the sufficient statistic we can accurately infer treatment effects on both a population and subgroup level. Our final contribution is the development of a novel regularisation strategy that can be applied to deep machine learning models to enforce clinical interpretability. We specifically train deep time-series models such that their predictions have high accuracy while being closely modelled by small decision trees that can be audited easily by medical experts. Broadly, our tree-based explanations can be used to provide additional context in scenarios where reasoning about treatment effects may otherwise be difficult. Importantly, each of the models we present is an attempt to bring about more understanding in medical applications to inform better decision-making overall.
Guarantee Regions for Local Explanations
Interpretability methods that utilise local surrogate models (e.g. LIME) are
very good at describing the behaviour of the predictive model at a point of
interest, but they are not guaranteed to extrapolate to the local region
surrounding the point. Moreover, overfitting to the local curvature of the
predictive model and malicious tampering can significantly limit extrapolation.
We propose an anchor-based algorithm for identifying regions in which local
explanations are guaranteed to be correct by explicitly describing those
intervals along which the input features can be trusted. Our method produces an
interpretable feature-aligned box where the prediction of the local surrogate
model is guaranteed to match the predictive model. We demonstrate that our
algorithm can be used to find explanations with larger guarantee regions that
better cover the data manifold compared to existing baselines. We also show how
our method can identify misleading local explanations with significantly poorer
guarantee regions.
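The core idea, a feature-aligned box within which a local surrogate provably matches the model, can be sketched with a sampled agreement check. Everything below (the models, data, perturbation scales, and the Monte Carlo test standing in for the paper's guarantee) is an illustrative assumption, not the paper's actual anchor-based algorithm.

```python
# A hedged sketch: grow an axis-aligned box around a point of interest and keep
# the largest extent within which a local surrogate's labels agree with the
# black-box model on sampled points. This sampled check only approximates the
# formal guarantee described in the abstract.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)   # black-box model
x0 = np.array([1.0, 1.0])                                  # point of interest

# LIME-style local surrogate: fit on perturbations around x0, labelled by the model.
Z = x0 + rng.normal(scale=1.0, size=(200, 2))
surrogate = LogisticRegression().fit(Z, model.predict(Z))

def agreement(radius, n=200):
    """Fraction of sampled points in the box x0 +/- radius where labels agree."""
    P = x0 + rng.uniform(-radius, radius, size=(n, 2))
    return np.mean(model.predict(P) == surrogate.predict(P))

# Expand the box while the surrogate still matches the model everywhere sampled.
radius = 0.1
while radius < 3.0 and agreement(radius + 0.1) == 1.0:
    radius += 0.1
print(f"guarantee box: x0 +/- {radius:.1f} per feature")
```

A small resulting box is itself informative: it flags a local explanation whose validity barely extends beyond the query point, which is how the paper identifies misleading explanations.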
Leveraging Factored Action Spaces for Off-Policy Evaluation
Off-policy evaluation (OPE) aims to estimate the benefit of following a
counterfactual sequence of actions, given data collected from executed
sequences. However, existing OPE estimators often exhibit high bias and high
variance in problems involving large, combinatorial action spaces. We
investigate how to mitigate this issue using factored action spaces, i.e.,
expressing each action as a combination of independent sub-actions from smaller
action spaces. This approach facilitates a finer-grained analysis of how
actions differ in their effects. In this work, we propose a new family of
"decomposed" importance sampling (IS) estimators based on factored action
spaces. Given certain assumptions on the underlying problem structure, we prove
that the decomposed IS estimators have less variance than their original
non-decomposed versions, while preserving the property of zero bias. Through
simulations, we empirically verify our theoretical results, probing the
validity of various assumptions. Provided with a technique that can derive the
action space factorisation for a given problem, our work shows that OPE can be
improved "for free" by utilising this inherent problem structure.
Comment: Main paper: 8 pages, 7 figures. Appendix: 30 pages, 17 figures.
Accepted at ICML 2023 Workshop on Counterfactuals in Minds and Machines,
Honolulu, Hawaii, USA. Camera ready version
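The variance reduction can be seen in a one-step toy problem. The bandit below, with two independent binary sub-actions and an additively decomposable reward, is an assumed setup chosen so that both the standard and decomposed importance sampling (IS) estimators are unbiased and directly comparable; it is not the paper's experimental benchmark.

```python
# A hedged sketch of "decomposed" importance sampling with a factored action
# space: each sub-action gets its own importance ratio, applied only to the
# reward component that depends on it, instead of one joint product ratio.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Behaviour and evaluation policies factorise over the two binary sub-actions.
pb = np.array([0.5, 0.7])   # P_b(sub-action j = 1)
pe = np.array([0.9, 0.2])   # P_e(sub-action j = 1)

a = (rng.uniform(size=(n, 2)) < pb).astype(float)          # logged actions
r_parts = np.column_stack([2.0 * a[:, 0], 1.0 * a[:, 1]])  # r = r1(a1) + r2(a2)

def ratio(j):
    """Per-sub-action importance ratio pi_e / pi_b."""
    return np.where(a[:, j] == 1, pe[j] / pb[j], (1 - pe[j]) / (1 - pb[j]))

w_joint = ratio(0) * ratio(1)                  # standard IS: product of ratios
standard_is = w_joint * r_parts.sum(axis=1)
decomposed_is = ratio(0) * r_parts[:, 0] + ratio(1) * r_parts[:, 1]

true_value = 2.0 * pe[0] + 1.0 * pe[1]         # analytic policy value = 2.0
print(f"true={true_value:.2f}  "
      f"standard={standard_is.mean():.2f} (var {standard_is.var():.2f})  "
      f"decomposed={decomposed_is.mean():.2f} (var {decomposed_is.var():.2f})")
```

Both estimators target the same value, but the decomposed estimator avoids multiplying together ratios for sub-actions that do not affect a given reward component, which is the source of its lower variance when the additive-reward assumption holds.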
Beyond Sparsity: Tree Regularization of Deep Models for Interpretability
The lack of interpretability remains a key barrier to the adoption of deep
models in many applications. In this work, we explicitly regularize deep models
so human users might step through the process behind their predictions in
little time. Specifically, we train deep time-series models so their
class-probability predictions have high accuracy while being closely modeled by
decision trees with few nodes. Using intuitive toy examples as well as medical
tasks for treating sepsis and HIV, we demonstrate that this new tree
regularization yields models that are easier for humans to simulate than
simpler L1 or L2 penalties without sacrificing predictive power.
Comment: To appear in AAAI 2018. Contains 9-page main paper and appendix with
supplementary material
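The quantity tree regularization penalises, the size of a small decision tree that mimics the network, can be computed post hoc as a sketch. The full method makes this penalty differentiable via a surrogate and applies it during training of deep time-series models; the toy classifier and dataset below are illustrative assumptions that only show the proxy being measured.

```python
# A minimal, hedged sketch of the "simulability" proxy behind tree
# regularization: fit a small decision tree to a network's predictions and
# measure its fidelity and average decision-path length.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                    random_state=0).fit(X, y)

# Fit a small tree to mimic the network; its size proxies human simulability.
y_net = net.predict(X)
tree = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0).fit(X, y_net)

# Average path length: mean number of node tests needed to classify a point.
paths = tree.decision_path(X)
avg_path_len = paths.sum(axis=1).mean() - 1        # exclude the leaf itself
fidelity = (tree.predict(X) == y_net).mean()
print(f"surrogate fidelity: {fidelity:.2f}, "
      f"average path length: {avg_path_len:.2f}")
```

In the paper this path length becomes a training-time penalty, so the network is pushed toward decision functions that short trees can reproduce, rather than merely being explained by a tree after the fact.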
Informed MCMC with Bayesian Neural Networks for Facial Image Analysis
Computer vision tasks are difficult because of the large variability in the
data that is induced by changes in light, background, partial occlusion as well
as the varying pose, texture, and shape of objects. Generative approaches to
computer vision allow us to overcome this difficulty by explicitly modeling the
physical image formation process. Using generative object models, the analysis
of an observed image is performed via Bayesian inference of the posterior
distribution. This conceptually simple approach tends to fail in practice
because of several difficulties stemming from sampling the posterior
distribution: high-dimensionality and multi-modality of the posterior
distribution as well as expensive simulation of the rendering process. The main
difficulty of sampling approaches in a computer vision context is choosing the
proposal distribution accurately so that maxima of the posterior are explored
early and the algorithm quickly converges to a valid image interpretation. In
this work, we propose to use a Bayesian Neural Network for estimating an image
dependent proposal distribution. Compared to a standard Gaussian random walk
proposal, this accelerates the sampler in finding regions of the posterior with
high value. In this way, we can significantly reduce the number of samples
needed to perform facial image analysis.
Comment: Accepted to the Bayesian Deep Learning Workshop at NeurIPS 201
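The benefit of an informed proposal can be sketched in one dimension. Here a fixed Gaussian independence proposal centred on a hypothetical network estimate (`mu_hat`, an assumed stand-in for the Bayesian neural network's image-dependent prediction) is compared with a plain random walk on a toy unimodal target; none of the numbers reflect the paper's actual face model.

```python
# A hedged sketch: Metropolis-Hastings with an observation-informed
# independence proposal versus a Gaussian random walk, starting far from the
# posterior mode. The informed chain reaches the high-density region sooner.
import numpy as np

rng = np.random.default_rng(0)
log_post = lambda x: -0.5 * ((x - 3.0) / 0.3) ** 2    # toy target at x = 3.0

def mh(n, x0, propose, log_q=None):
    """Generic MH; log_q is the proposal log-density for independence proposals."""
    x, chain = x0, []
    for _ in range(n):
        x_new = propose(x)
        log_alpha = log_post(x_new) - log_post(x)
        if log_q is not None:                          # independence correction
            log_alpha += log_q(x) - log_q(x_new)
        if np.log(rng.uniform()) < log_alpha:
            x = x_new
        chain.append(x)
    return np.array(chain)

# Random walk from a poor initialisation far from the mode.
rw = mh(500, -5.0, lambda x: x + rng.normal(scale=0.5))

# "Informed" proposal: Gaussian centred on the (hypothetical) network estimate.
mu_hat = 2.8                                          # assumed BNN prediction
informed = mh(500, -5.0, lambda x: rng.normal(mu_hat, 1.0),
              log_q=lambda x: -0.5 * ((x - mu_hat) / 1.0) ** 2)

burn = lambda c: int(np.argmax(np.abs(c - 3.0) < 1.0))  # first step near mode
print(f"steps to reach mode: random walk={burn(rw)}, informed={burn(informed)}")
```

The same mechanism, with the proposal's centre and spread predicted per image by a Bayesian neural network, is what lets the sampler explore posterior maxima early instead of diffusing toward them.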
Decision-Focused Model-based Reinforcement Learning for Reward Transfer
Decision-focused (DF) model-based reinforcement learning has recently been
introduced as a powerful algorithm that can focus on learning the MDP dynamics
that are most relevant for obtaining high returns. While this approach
increases the agent's performance by directly optimizing the reward, it does so
by learning less accurate dynamics from a maximum likelihood perspective. We
demonstrate that when the reward function is defined by preferences over
multiple objectives, the DF model may be sensitive to changes in the objective
preferences. In this work, we develop the robust decision-focused (RDF)
algorithm, which leverages the non-identifiability of DF solutions to learn
models that maximize expected returns while simultaneously learning models that
transfer to changes in the preference over multiple objectives. We demonstrate
the effectiveness of RDF on two synthetic domains and two healthcare
simulators, showing that it significantly improves the robustness of DF model
learning to changes in the reward function without compromising training-time
return.