DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage
treatment plans that adapt treatment decisions both to an individual's initial
features and to intermediate outcomes and features at each subsequent stage,
which are affected by decisions in prior stages. Examples include personalized
first- and second-line treatments of chronic conditions like diabetes, cancer,
and depression, which adapt to patient response to first-line treatment,
disease progression, and individual characteristics. While existing literature
mostly focuses on estimating the optimal DTR from offline data such as from
sequentially randomized trials, we study the problem of developing the optimal
DTR in an online manner, where the interaction with each individual affects both
our cumulative reward and our data collection for future learning. We term this
the DTR bandit problem. We propose a novel algorithm that, by carefully
balancing exploration and exploitation, is guaranteed to achieve rate-optimal
regret when the transition and reward models are linear. We demonstrate our
algorithm and its benefits both in synthetic experiments and in a case study of
adaptive treatment of major depressive disorder using real-world data.
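The exploration–exploitation balance the abstract describes can be illustrated with a minimal sketch: a single-stage linear bandit with per-action ridge estimates and decaying epsilon-greedy exploration. This is an illustrative stand-in, not the paper's rate-optimal algorithm, and all quantities (feature dimension, noise level, exploration schedule) are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_actions, T = 5, 2, 2000
theta = rng.normal(size=(n_actions, d))      # true (unknown) reward weights per action

# Per-action ridge estimates, updated online (illustrative epsilon-greedy,
# not the paper's algorithm).
A = [np.eye(d) for _ in range(n_actions)]
b = [np.zeros(d) for _ in range(n_actions)]
regret = 0.0

for t in range(T):
    x = rng.normal(size=d)                   # individual's features at this round
    est = [np.linalg.solve(A[k], b[k]) @ x for k in range(n_actions)]
    eps = min(1.0, 5.0 / (t + 1))            # decaying exploration rate
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(est))
    r = theta[a] @ x + 0.1 * rng.normal()    # noisy linear reward
    A[a] += np.outer(x, x)                   # the chosen arm's data also feeds
    b[a] += r * x                            # future learning, as in the abstract
    regret += max(theta[k] @ x for k in range(n_actions)) - theta[a] @ x

print(regret)  # cumulative regret stays far below T under this schedule
```

The point of the sketch is the tension the abstract names: each decision both earns reward and determines which arm's model gets more data.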
Computational neurorehabilitation: modeling plasticity and learning to predict recovery
Despite progress in using computational approaches to inform medicine and neuroscience in the last 30 years, there have been few attempts to model the mechanisms underlying sensorimotor rehabilitation. We argue that a fundamental understanding of neurologic recovery, and as a result accurate predictions at the individual level, will be facilitated by developing computational models of the salient neural processes, including plasticity and learning systems of the brain, and integrating them into a context specific to rehabilitation. Here, we therefore discuss Computational Neurorehabilitation, a newly emerging field aimed at modeling plasticity and motor learning to understand and improve movement recovery of individuals with neurologic impairment. We first explain how the emergence of robotics and wearable sensors for rehabilitation is providing data that make development and testing of such models increasingly feasible. We then review key aspects of plasticity and motor learning that such models will incorporate. We proceed by discussing how computational neurorehabilitation models relate to the current benchmark in rehabilitation modeling – regression-based, prognostic modeling. We then critically discuss the first computational neurorehabilitation models, which have primarily focused on modeling rehabilitation of the upper extremity after stroke, and show how even simple models have produced novel ideas for future investigation. Finally, we conclude with key directions for future research, anticipating that soon we will see the emergence of mechanistic models of motor recovery that are informed by clinical imaging results and driven by the actual movement content of rehabilitation therapy as well as wearable sensor-based records of daily activity.
Deep Reinforcement Learning for Event-Triggered Control
Event-triggered control (ETC) methods can achieve high-performance control
with a significantly lower number of samples compared to usual, time-triggered
methods. These frameworks are often based on a mathematical model of the system
and specific designs of controller and event trigger. In this paper, we show
how deep reinforcement learning (DRL) algorithms can be leveraged to
simultaneously learn control and communication behavior from scratch, and
present a DRL approach that is particularly suitable for ETC. To our knowledge,
this is the first work to apply DRL to ETC. We validate the approach on
multiple control tasks and compare it to model-based event-triggering
frameworks. In particular, we demonstrate that, unlike many model-based ETC
designs, it can be straightforwardly applied to nonlinear systems.
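The joint control-and-communication formulation can be sketched with a toy environment: a scalar plant where the action is a pair (transmit?, new input) and every transmission incurs a penalty, so a policy must trade control performance against communication. The plant dynamics, penalty, and the hand-coded threshold trigger standing in for the learned DRL policy are all assumptions for illustration, not the paper's setup.

```python
import numpy as np

# Illustrative scalar plant x' = a*x + b*u with a per-transmission penalty.
A_PLANT, B_PLANT, COMM_PENALTY = 1.1, 1.0, 0.05   # assumed values

def step(x, communicate, u_new, rng):
    """One step: the joint action is (communicate, u_new). Without a
    transmission the actuator applies no new input."""
    u = u_new if communicate else 0.0
    x_next = A_PLANT * x + B_PLANT * u + 0.01 * rng.normal()
    reward = -x_next**2 - (COMM_PENALTY if communicate else 0.0)
    return x_next, reward

# A hand-coded event trigger stands in for the learned policy:
# transmit only when the state magnitude exceeds a threshold.
rng = np.random.default_rng(1)
x, total_reward, n_comm = 1.0, 0.0, 0
for _ in range(200):
    communicate = abs(x) > 0.2
    u_new = -A_PLANT / B_PLANT * x            # deadbeat input when transmitting
    x, r = step(x, communicate, u_new, rng)
    total_reward += r
    n_comm += communicate

print(n_comm)  # far fewer than 200 transmissions
```

A DRL agent trained on this reward would learn both when to transmit and what input to send, which is the "control and communication from scratch" idea in the abstract.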
Adaptive Interventions Treatment Modelling and Regimen Optimization Using Sequential Multiple Assignment Randomized Trials (SMART) and Q-Learning
Current pharmacological practice focuses on a single best treatment for a disease, which is impractical because the same treatment may not work the same way for every patient. There is therefore a need to shift from a disease-centric to a patient-centric approach, in which a patient's personal characteristics or biomarkers are used to determine a tailored optimal treatment. This research area of personalized medicine contradicts the "one size fits all" concept. The Sequential Multiple Assignment Randomized Trial (SMART) is a multi-stage trial design that informs the development of dynamic treatment regimens (DTRs). In a SMART, a subject is randomized through successive stages of treatment, where each stage corresponds to a treatment decision. These adaptive interventions are individualized and repeatedly adjusted over time based on the patient's clinical characteristics and ongoing performance. Q-learning, a reinforcement learning algorithm for optimizing treatment regimens to maximize a desired clinical outcome, is used to optimize the sequence of treatments. The statistical model uses regression analysis to approximate the Q-functions from clinical trial data, and predicts a series of regimens across time, depending on the biomarkers of a new participant, for optimizing weight management decision rules. Because reinforcement learning is a machine learning approach, training data are needed to fit the model, that is, to approximate the Q-functions. The fitted Q-functions are then evaluated, and after evaluation they are further tested by applying the resulting treatment rules to future patients. Thus, in this thesis, the dataset obtained from Sanford Health is first restructured to make it suitable for our model.
The restructured training data are used in regression analysis to approximate the Q-functions. The regression analysis yields estimates of the coefficients associated with each covariate in the regression function. Model goodness-of-fit and the underlying assumptions of linear regression are assessed using the regression summary table and residual diagnostic plots. Because a two-stage SMART design is used, the Q-functions for both stages must be estimated via multiple linear regression. After assessing fit adequacy, the model is applied to prescribe treatment rules for future patients: the prognostic and predictive covariates of a new patient are acquired, and the optimal treatment rule at each decision stage is the treatment that maximizes the estimated Q-function. The estimated value of each regime was also computed using a value estimator, and the regime with the maximum estimated value was chosen as the optimal treatment decision rule.
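The two-stage backward-induction fit described above can be sketched in a few lines: regress the outcome on the stage-2 history to get the stage-2 Q-function, form a pseudo-outcome by maximizing over the stage-2 treatment, then regress that pseudo-outcome on the stage-1 history. The simulated data below is an illustrative stand-in for the restructured Sanford Health dataset, which is not available here; the variable names and the data-generating model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated two-stage SMART data (illustrative; the generating model is assumed).
x1 = rng.normal(size=n)                                   # baseline covariate / biomarker
a1 = rng.choice([-1, 1], size=n)                          # stage-1 randomized treatment
x2 = 0.5 * x1 + 0.3 * a1 + rng.normal(scale=0.5, size=n)  # intermediate outcome
a2 = rng.choice([-1, 1], size=n)                          # stage-2 randomized treatment
y = x1 + x2 + a1 * 0.5 * x1 + a2 * (0.8 * x2 - 0.2) + rng.normal(scale=0.5, size=n)

def ols(F, target):
    """Least-squares coefficient estimates, as read off a regression summary table."""
    beta, *_ = np.linalg.lstsq(F, target, rcond=None)
    return beta

# Stage 2: regress Y on the full history with stage-2 treatment interactions.
F2 = np.column_stack([np.ones(n), x1, a1, a1 * x1, x2, a2, a2 * x2])
beta2 = ols(F2, y)

def q2(x1v, a1v, x2v, a2v):
    return beta2 @ np.array([1.0, x1v, a1v, a1v * x1v, x2v, a2v, a2v * x2v])

# Pseudo-outcome: the value of choosing the stage-2 treatment optimally.
v2 = np.array([max(q2(x1[i], a1[i], x2[i], a) for a in (-1, 1)) for i in range(n)])

# Stage 1: regress the pseudo-outcome on the stage-1 history and treatment.
F1 = np.column_stack([np.ones(n), x1, a1, a1 * x1])
beta1 = ols(F1, v2)

def stage1_rule(x1v):
    """Recommended stage-1 treatment: argmax of the fitted stage-1 Q-function."""
    return max((-1, 1), key=lambda a: beta1 @ np.array([1.0, x1v, a, a * x1v]))

def stage2_rule(x1v, a1v, x2v):
    """Recommended stage-2 treatment given the observed history."""
    return max((-1, 1), key=lambda a: q2(x1v, a1v, x2v, a))

print(stage1_rule(2.0), stage2_rule(2.0, 1, 2.0))
```

Because treatments are randomized within each SMART stage, ordinary least squares on these interaction models identifies the treatment effects, which is what justifies prescribing the argmax treatment to a new patient.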