65 research outputs found

Learning and policy search in stochastic dynamical systems with Bayesian neural networks

Author: Depeweg S
Doshi-Velez F
Hernández-Lobato JM
Udluft S
Publication venue: 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings
Publication date: 01/01/2017

We present an algorithm for policy search in stochastic dynamical systems using model-based reinforcement learning. The system dynamics are described with Bayesian neural networks (BNNs) that include stochastic input variables. These input variables allow us to capture complex statistical patterns in the transition dynamics (e.g. multi-modality and heteroskedasticity), which are usually missed by alternative modeling approaches. After learning the dynamics, our BNNs are then fed into an algorithm that performs random roll-outs and uses stochastic optimization for policy learning. We train our BNNs by minimizing α-divergences with α = 0.5, which usually produces better results than other techniques such as variational Bayes. We illustrate the performance of our method by solving a challenging problem where model-based approaches usually fail and by obtaining promising results in real-world scenarios including the control of a gas turbine and an industrial benchmark.
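The roll-out step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: `stochastic_dynamics` is a hypothetical hand-written stand-in for a learned BNN, the linear policy and quadratic cost are assumptions, and plain random search replaces the stochastic gradient optimization used in the paper.

```python
import numpy as np

def stochastic_dynamics(state, action, z):
    # Hypothetical stand-in for a learned model (assumption, not a BNN).
    # The random input z picks one of two modes (multi-modality) and the
    # noise scale grows with |state| (heteroskedasticity).
    mode = np.where(z[:, 0] > 0.0, 1.0, -1.0)
    noise = 0.1 * (1.0 + np.abs(state)) * z[:, 1]
    return 0.9 * state + action + 0.5 * mode + noise

def expected_cost(gain, horizon=20, n_rollouts=500, seed=0):
    # Monte Carlo estimate of the average squared state under the
    # linear policy action = -gain * state, using random roll-outs.
    # A fixed seed reuses the same random inputs for every candidate.
    rng = np.random.default_rng(seed)
    state = np.zeros(n_rollouts)
    total = 0.0
    for _ in range(horizon):
        action = -gain * state
        z = rng.standard_normal((n_rollouts, 2))
        state = stochastic_dynamics(state, action, z)
        total += np.mean(state ** 2)
    return total / horizon

def random_search(n_iters=50, step=0.1, seed=1):
    # Simple random search over the policy gain (swapped in for the
    # gradient-based stochastic optimization named in the abstract).
    rng = np.random.default_rng(seed)
    gain = 0.0
    best = expected_cost(gain)
    for _ in range(n_iters):
        candidate = gain + step * rng.standard_normal()
        cost = expected_cost(candidate)
        if cost < best:
            gain, best = candidate, cost
    return gain, best

gain, best = random_search()
```

The search only accepts a candidate gain when its estimated cost is lower, so `best` never exceeds the cost of the initial policy.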

arXiv.org e-Print Archive

Apollo (Cambridge)

CUED - Cambridge University Engineering Department

A Comparison of Human and Agent Reinforcement Learning in Partially Observable Domains.

Author: Doshi-Velez F
Ghahramani Z
Publication venue: cognitivesciencesociety.org
Publication date: 01/01/2011

CUED - Cambridge University Engineering Department

Accelerated Gibbs sampling for the Indian buffet process

Author: Doshi-Velez F
Ghahramani Z
Publication venue: ACM Press
Publication date: 18/06/2009

CUED - Cambridge University Engineering Department

Correlated Non-Parametric Latent Feature Models.

Author: Doshi-Velez F
Ghahramani Z
Publication venue: AUAI Press
Publication date: 01/01/2009

CUED - Cambridge University Engineering Department

Accelerated sampling for the Indian Buffet Process.

Author: Doshi-Velez F
Ghahramani Z
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 01/01/2009

CUED - Cambridge University Engineering Department

Predicting intervention onset in the ICU with switching state space models.

Author: Doshi-Velez F
Ghassemi Marzyeh
Hughes M
Wu M
Publication venue: American Medical Informatics Association
Publication date: 10/07/2019

The impact of many intensive care unit interventions has not been fully quantified, especially in heterogeneous patient populations. We train unsupervised switching state autoregressive models on vital signs from the public MIMIC-III database to capture patient movement between physiological states. We compare our learned states to static demographics and raw vital signs in the prediction of five ICU treatments: ventilation, vasopressor administration, and three transfusions. We show that our learned states, when combined with demographics and raw vital signs, improve prediction for most interventions even 4 or 8 hours ahead of onset. Our results are competitive with existing work while using a substantially larger and more diverse cohort of 36,050 patients. While custom classifiers can only target a specific clinical event, our model learns physiological states which can help with many interventions. Our robust patient state representations provide a path towards evidence-driven administration of clinical interventions. National Library of Medicine Biomedical Informatics Research Training (Grant NIH/NLM 2T15 LM007092-22)
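As a rough illustration of the model class named in this abstract, the sketch below simulates a two-state switching autoregressive process of order one. It shows only the generative side, not the unsupervised training or the MIMIC-III pipeline; the function name and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def simulate_switching_ar(n_steps, trans, coef, offset, noise_std, seed=0):
    # Hidden discrete state follows a Markov chain with transition
    # matrix `trans`; each state has its own AR(1) parameters.
    rng = np.random.default_rng(seed)
    n_states = trans.shape[0]
    states = np.zeros(n_steps, dtype=int)
    obs = np.zeros(n_steps)
    for t in range(1, n_steps):
        # Sample the next hidden state given the previous one.
        states[t] = rng.choice(n_states, p=trans[states[t - 1]])
        k = states[t]
        # The observation depends on the previous observation and on
        # the parameters of the currently active state.
        obs[t] = (coef[k] * obs[t - 1] + offset[k]
                  + noise_std[k] * rng.standard_normal())
    return states, obs

# Illustrative (made-up) parameters for two states.
trans = np.array([[0.95, 0.05],
                  [0.10, 0.90]])
states, obs = simulate_switching_ar(
    200, trans,
    coef=np.array([0.8, 0.5]),
    offset=np.array([0.0, 2.0]),
    noise_std=np.array([0.1, 0.3]),
)
```

Fitting such a model would infer `states` from observed series rather than sample them; the simulation is meant only to make the switching structure concrete.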

DSpace@MIT