DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage
treatment plans that adapt treatment decisions both to an individual's initial
features and to intermediate outcomes and features at each subsequent stage,
which are affected by decisions in prior stages. Examples include personalized
first- and second-line treatments of chronic conditions like diabetes, cancer,
and depression, which adapt to patient response to first-line treatment,
disease progression, and individual characteristics. While existing literature
mostly focuses on estimating the optimal DTR from offline data such as from
sequentially randomized trials, we study the problem of developing the optimal
DTR in an online manner, where the interaction with each individual affects both
our cumulative reward and our data collection for future learning. We term this
the DTR bandit problem. We propose a novel algorithm that, by carefully
balancing exploration and exploitation, is guaranteed to achieve rate-optimal
regret when the transition and reward models are linear. We demonstrate our
algorithm and its benefits both in synthetic experiments and in a case study of
adaptive treatment of major depressive disorder using real-world data.
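The exploration–exploitation balance the abstract describes can be illustrated with a minimal sketch: a single-stage linear bandit with per-action ridge estimates and decaying epsilon-greedy exploration. This is an illustrative stand-in, not the paper's rate-optimal algorithm, and all quantities (feature dimension, noise level, exploration schedule) are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_actions, T = 5, 2, 2000
theta = rng.normal(size=(n_actions, d))      # true (unknown) reward weights per action

# Per-action ridge estimates, updated online (illustrative epsilon-greedy,
# not the paper's algorithm).
A = [np.eye(d) for _ in range(n_actions)]
b = [np.zeros(d) for _ in range(n_actions)]
regret = 0.0

for t in range(T):
    x = rng.normal(size=d)                   # individual's features at this round
    est = [np.linalg.solve(A[k], b[k]) @ x for k in range(n_actions)]
    eps = min(1.0, 5.0 / (t + 1))            # decaying exploration rate
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(est))
    r = theta[a] @ x + 0.1 * rng.normal()    # noisy linear reward
    A[a] += np.outer(x, x)                   # the chosen arm's data also feeds
    b[a] += r * x                            # future learning, as in the abstract
    regret += max(theta[k] @ x for k in range(n_actions)) - theta[a] @ x

print(regret)  # cumulative regret stays far below T under this schedule
```

The point of the sketch is the tension the abstract names: each decision both earns reward and determines which arm's model gets more data.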
Computational neurorehabilitation: modeling plasticity and learning to predict recovery
Despite progress in using computational approaches to inform medicine and neuroscience in the last 30 years, there have been few attempts to model the mechanisms underlying sensorimotor rehabilitation. We argue that a fundamental understanding of neurologic recovery, and as a result accurate predictions at the individual level, will be facilitated by developing computational models of the salient neural processes, including plasticity and learning systems of the brain, and integrating them into a context specific to rehabilitation. Here, we therefore discuss Computational Neurorehabilitation, a newly emerging field aimed at modeling plasticity and motor learning to understand and improve movement recovery of individuals with neurologic impairment. We first explain how the emergence of robotics and wearable sensors for rehabilitation is providing data that make development and testing of such models increasingly feasible. We then review key aspects of plasticity and motor learning that such models will incorporate. We proceed by discussing how computational neurorehabilitation models relate to the current benchmark in rehabilitation modeling – regression-based, prognostic modeling. We then critically discuss the first computational neurorehabilitation models, which have primarily focused on modeling rehabilitation of the upper extremity after stroke, and show how even simple models have produced novel ideas for future investigation. Finally, we conclude with key directions for future research, anticipating that soon we will see the emergence of mechanistic models of motor recovery that are informed by clinical imaging results and driven by the actual movement content of rehabilitation therapy as well as wearable sensor-based records of daily activity.
Deep Reinforcement Learning for Event-Triggered Control
Event-triggered control (ETC) methods can achieve high-performance control
with a significantly lower number of samples compared to usual, time-triggered
methods. These frameworks are often based on a mathematical model of the system
and specific designs of controller and event trigger. In this paper, we show
how deep reinforcement learning (DRL) algorithms can be leveraged to
simultaneously learn control and communication behavior from scratch, and
present a DRL approach that is particularly suitable for ETC. To our knowledge,
this is the first work to apply DRL to ETC. We validate the approach on
multiple control tasks and compare it to model-based event-triggering
frameworks. In particular, we demonstrate that, unlike many model-based ETC
designs, it can be straightforwardly applied to nonlinear systems.
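The joint control-and-communication formulation can be sketched with a toy environment: a scalar plant where the action is a pair (transmit?, new input) and every transmission incurs a penalty, so a policy must trade control performance against communication. The plant dynamics, penalty, and the hand-coded threshold trigger standing in for the learned DRL policy are all assumptions for illustration, not the paper's setup.

```python
import numpy as np

# Illustrative scalar plant x' = a*x + b*u with a per-transmission penalty.
A_PLANT, B_PLANT, COMM_PENALTY = 1.1, 1.0, 0.05   # assumed values

def step(x, communicate, u_new, rng):
    """One step: the joint action is (communicate, u_new). Without a
    transmission the actuator applies no new input."""
    u = u_new if communicate else 0.0
    x_next = A_PLANT * x + B_PLANT * u + 0.01 * rng.normal()
    reward = -x_next**2 - (COMM_PENALTY if communicate else 0.0)
    return x_next, reward

# A hand-coded event trigger stands in for the learned policy:
# transmit only when the state magnitude exceeds a threshold.
rng = np.random.default_rng(1)
x, total_reward, n_comm = 1.0, 0.0, 0
for _ in range(200):
    communicate = abs(x) > 0.2
    u_new = -A_PLANT / B_PLANT * x            # deadbeat input when transmitting
    x, r = step(x, communicate, u_new, rng)
    total_reward += r
    n_comm += communicate

print(n_comm)  # far fewer than 200 transmissions
```

A DRL agent trained on this reward would learn both when to transmit and what input to send, which is the "control and communication from scratch" idea in the abstract.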
Adaptive Interventions Treatment Modelling and Regimen Optimization Using Sequential Multiple Assignment Randomized Trials (SMART) and Q-Learning
Current pharmacological practice focuses on a single best treatment for a disease, which is impractical because the same treatment may not work the same way for every patient. There is therefore a need to shift from a disease-centric to a patient-centric approach, in which a patient's personal characteristics or biomarkers are used to determine a tailored optimal treatment. This research area of personalized medicine contradicts the "one size fits all" concept. The Sequential Multiple Assignment Randomized Trial (SMART) is a multi-stage trial design that informs the development of dynamic treatment regimens (DTRs). In a SMART, a subject is randomized through successive stages of treatment, where each stage corresponds to a treatment decision. These adaptive interventions are individualized and repeatedly adjusted over time based on the patient's clinical characteristics and ongoing performance. Q-learning, a reinforcement learning algorithm for optimizing treatment regimens to maximize a desired clinical outcome, is used to optimize the sequence of treatments. The statistical model uses regression analysis to approximate the Q-functions from clinical trial data, and predicts a series of regimens across time, depending on the biomarkers of a new participant, for optimizing weight management decision rules. Because reinforcement learning is a machine learning approach, training data are needed to fit the model, that is, to approximate the Q-functions. The fitted Q-functions are then evaluated, and after evaluation they are further tested by applying the resulting treatment rules to future patients. Thus, in this thesis, the dataset obtained from Sanford Health is first restructured to make it suitable for our model.
The restructured training data are used in regression analysis to approximate the Q-functions. The regression analysis yields estimates of the coefficients associated with each covariate in the regression function. Model goodness-of-fit and the underlying assumptions of linear regression are assessed using the regression summary table and residual diagnostic plots. Because a two-stage SMART design is used, the Q-functions for both stages must be estimated via multiple linear regression. After assessing fit adequacy, the model is applied to prescribe treatment rules for future patients: the prognostic and predictive covariates of a new patient are acquired, and the optimal treatment rule at each decision stage is the treatment that maximizes the estimated Q-function. The estimated value of each regime was also computed using a value estimator, and the regime with the maximum estimated value was chosen as the optimal treatment decision rule.
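The two-stage backward-induction fit described above can be sketched in a few lines: regress the outcome on the stage-2 history to get the stage-2 Q-function, form a pseudo-outcome by maximizing over the stage-2 treatment, then regress that pseudo-outcome on the stage-1 history. The simulated data below is an illustrative stand-in for the restructured Sanford Health dataset, which is not available here; the variable names and the data-generating model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated two-stage SMART data (illustrative; the generating model is assumed).
x1 = rng.normal(size=n)                                   # baseline covariate / biomarker
a1 = rng.choice([-1, 1], size=n)                          # stage-1 randomized treatment
x2 = 0.5 * x1 + 0.3 * a1 + rng.normal(scale=0.5, size=n)  # intermediate outcome
a2 = rng.choice([-1, 1], size=n)                          # stage-2 randomized treatment
y = x1 + x2 + a1 * 0.5 * x1 + a2 * (0.8 * x2 - 0.2) + rng.normal(scale=0.5, size=n)

def ols(F, target):
    """Least-squares coefficient estimates, as read off a regression summary table."""
    beta, *_ = np.linalg.lstsq(F, target, rcond=None)
    return beta

# Stage 2: regress Y on the full history with stage-2 treatment interactions.
F2 = np.column_stack([np.ones(n), x1, a1, a1 * x1, x2, a2, a2 * x2])
beta2 = ols(F2, y)

def q2(x1v, a1v, x2v, a2v):
    return beta2 @ np.array([1.0, x1v, a1v, a1v * x1v, x2v, a2v, a2v * x2v])

# Pseudo-outcome: the value of choosing the stage-2 treatment optimally.
v2 = np.array([max(q2(x1[i], a1[i], x2[i], a) for a in (-1, 1)) for i in range(n)])

# Stage 1: regress the pseudo-outcome on the stage-1 history and treatment.
F1 = np.column_stack([np.ones(n), x1, a1, a1 * x1])
beta1 = ols(F1, v2)

def stage1_rule(x1v):
    """Recommended stage-1 treatment: argmax of the fitted stage-1 Q-function."""
    return max((-1, 1), key=lambda a: beta1 @ np.array([1.0, x1v, a, a * x1v]))

def stage2_rule(x1v, a1v, x2v):
    """Recommended stage-2 treatment given the observed history."""
    return max((-1, 1), key=lambda a: q2(x1v, a1v, x2v, a))

print(stage1_rule(2.0), stage2_rule(2.0, 1, 2.0))
```

Because treatments are randomized within each SMART stage, ordinary least squares on these interaction models identifies the treatment effects, which is what justifies prescribing the argmax treatment to a new patient.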