7 research outputs found

    Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

    Acting in domains where an agent must plan several steps ahead to achieve a goal can be a challenging task, especially if the agentʼs sensors provide only noisy or partial information. In this setting, Partially Observable Markov Decision Processes (POMDPs) provide a planning framework that optimally trades off between actions that contribute to the agentʼs knowledge and actions that increase the agentʼs immediate reward. However, the task of specifying the POMDPʼs parameters is often onerous. In particular, setting the immediate rewards to achieve a desired balance between information-gathering and acting is often not intuitive. In this work, we propose an approximation based on minimizing the immediate Bayes risk for choosing actions when transition, observation, and reward models are uncertain. The Bayes-risk criterion avoids the computational intractability of solving a POMDP with a multi-dimensional continuous state space; we show that it performs well in a variety of problems. We use policy queries—in which we ask an expert for the correct action—to infer the consequences of a potential pitfall without experiencing its effects. More importantly for human–robot interaction settings, policy queries allow the agent to learn the reward model without the reward values ever being specified.
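    The action-selection rule the abstract describes can be pictured with a minimal Python sketch. Everything here is an assumed interface for illustration (model_posterior, q_value, RISK_THRESHOLD are hypothetical names, not the paper's implementation): the agent acts greedily when the immediate Bayes risk is small and issues a policy query otherwise.

```python
# Minimal sketch of Bayes-risk action selection under model uncertainty.
# model_posterior, q_value, and RISK_THRESHOLD are illustrative assumptions.
import numpy as np

RISK_THRESHOLD = 0.1      # assumed tolerance before asking the expert
N_MODEL_SAMPLES = 50      # posterior samples over POMDP parameters

def bayes_risk_action(belief, model_posterior, actions, q_value, rng):
    """Pick the action minimizing immediate Bayes risk; fall back to a
    policy query when even the best action is too risky."""
    # Sample candidate POMDPs from the posterior over (T, O, R) parameters.
    models = [model_posterior.sample(rng) for _ in range(N_MODEL_SAMPLES)]
    # Expected value of each action, averaged over model uncertainty.
    mean_q = {a: np.mean([q_value(m, belief, a) for m in models])
              for a in actions}
    best = max(mean_q, key=mean_q.get)
    # Bayes risk of committing to `best`: expected regret versus acting
    # optimally in each sampled model.
    risk = np.mean([max(q_value(m, belief, a) for a in actions)
                    - q_value(m, belief, best) for m in models])
    if risk > RISK_THRESHOLD:
        return "ASK_EXPERT"   # policy query: the expert supplies the action
    return best
```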

    A Bayesian nonparametric approach to modeling battery health

    The batteries of many consumer products are both a substantial portion of the product's cost and commonly a first point of failure. Accurately predicting remaining battery life can lower costs by reducing unnecessary battery replacements. Unfortunately, battery dynamics are extremely complex, and we often lack the domain knowledge required to construct a model by hand. In this work, we take a data-driven approach and aim to learn a model of battery time-to-death from training data. Using a Dirichlet process prior over mixture weights, we learn an infinite mixture model for battery health. The Bayesian aspect of our model helps to avoid over-fitting, while the nonparametric nature of the model allows the data to control the size of the model, preventing under-fitting. We demonstrate our model's effectiveness by making time-to-death predictions using real data from nickel-metal hydride battery packs.
    Sponsors: United States. Army Research Office (Nostra Project STTR W911NF-08-C-0066); iRobot
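    The nonparametric idea translates into a short sketch, here substituting scikit-learn's truncated Dirichlet-process mixture for the paper's model; the battery data array below is hypothetical, standing in for discharge features plus time-to-death.

```python
# Sketch only: scikit-learn's truncated DP mixture stands in for the
# paper's infinite mixture; the data array below is hypothetical.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Hypothetical rows: [discharge-rate stats..., time_to_death] per battery pack.
X = rng.normal(size=(200, 4))

dpgmm = BayesianGaussianMixture(
    n_components=20,        # truncation level, not the final model size
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    random_state=0,
).fit(X)

# The DP prior drives unused components to near-zero weight, so the data
# controls the effective model size (guarding against under-fitting) while
# the Bayesian averaging guards against over-fitting.
print("effective components:", int(np.sum(dpgmm.weights_ > 1e-2)))
```

    A time-to-death prediction would then condition the fitted joint mixture on the observed features of a new battery pack.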

    Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning

    Making intelligent decisions from incomplete information is critical in many applications: for example, robots must choose actions based on imperfect sensors, and speech-based interfaces must infer a user’s needs from noisy microphone inputs. What makes these tasks hard is that we often lack a natural representation with which to model the domain and choose actions; we must learn about the domain’s properties while simultaneously performing the task. Learning a representation also involves trade-offs between modeling the data that we have seen previously and being able to make predictions about new data. This article explores learning representations of stochastic systems using Bayesian nonparametric statistics. Bayesian nonparametric methods allow the sophistication of a representation to scale gracefully with the complexity in the data. Our main contribution is a careful empirical evaluation of how representations learned using Bayesian nonparametric methods compare to other standard learning approaches, especially in support of planning and control. We show that the Bayesian aspects of the methods achieve state-of-the-art decision-making performance from relatively few samples, while the nonparametric aspects often result in fewer computations. These results hold across a variety of different techniques for choosing actions given a representation.
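    The "scale gracefully" claim is easiest to see in a toy Chinese-restaurant-process simulation (a generic illustration of the principle, not code from the article): the number of latent components grows slowly with the amount of data instead of being fixed in advance.

```python
# Toy CRP sketch: component count grows with the data, not fixed a priori.
import numpy as np

def crp_assignments(n_points, alpha, rng):
    counts = []                       # points per component ("table")
    for _ in range(n_points):
        probs = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(probs), p=probs / probs.sum())
        if k == len(counts):
            counts.append(1)          # open a new component
        else:
            counts[k] += 1
    return len(counts)

rng = np.random.default_rng(1)
for n in (10, 100, 1000):
    print(n, "points ->", crp_assignments(n, alpha=1.0, rng=rng), "components")
```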

    Nonparametric Bayesian Policy Priors for Reinforcement Learning

    We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.
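    A minimal sketch of how such priors could combine (the scoring interfaces are assumptions, not the paper's algorithm): each candidate model is weighted by its fit to the agent's own experience, the fit of its induced policy to the expert trajectories, and complexity penalties playing the role of the simplicity priors.

```python
# Sketch: posterior weights over candidate models, biased toward simple
# representations and simple induced policies. All interfaces are assumed.
import numpy as np

def posterior_weights(models, agent_data, expert_trajs,
                      model_loglik, policy_loglik, complexity):
    scores = []
    for m in models:
        policy = m.solve()                              # policy induced by model m
        scores.append(model_loglik(m, agent_data)       # independent exploration
                      + policy_loglik(policy, expert_trajs)  # expert demonstrations
                      - complexity(m)                   # prior: simple model
                      - complexity(policy))             # prior: simple policy
    w = np.exp(np.array(scores) - max(scores))          # numerically stable softmax
    return w / w.sum()
```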

    Infinite dynamic Bayesian networks

    We present the infinite dynamic Bayesian network model (iDBN), a nonparametric, factored state-space model that generalizes dynamic Bayesian networks (DBNs). The iDBN can infer every aspect of a DBN: the number of hidden factors, the number of values each factor can take, and (arbitrarily complex) connections and conditionals between factors and observations. In this way, the iDBN generalizes other nonparametric state-space models, which until now have generally focused on binary hidden nodes and more restricted connection structures. We show how this new prior allows us to find interesting structure in benchmark tests and on two real-world datasets involving weather data and neural information flow networks.
    Sponsors: Massachusetts Institute of Technology (Hugh Hampton Young Memorial Fund Fellowship); United States. Air Force Office of Scientific Research (AFOSR FA9550-07-1-0075)
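    The "infinite" part of such a prior can be pictured with a stick-breaking construction (a generic illustration, not the iDBN inference code): weights over a potentially unbounded set of hidden factors come from repeatedly breaking off pieces of a unit stick, so only finitely many factors carry non-negligible mass.

```python
# Toy stick-breaking sketch for an unbounded set of hidden factors.
import numpy as np

def stick_breaking(alpha, rng, tol=1e-3):
    weights, remaining = [], 1.0
    while remaining > tol:            # truncate once leftover mass is tiny
        frac = rng.beta(1.0, alpha)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    return np.array(weights)

rng = np.random.default_rng(2)
w = stick_breaking(alpha=3.0, rng=rng)
print(len(w), "factors with non-negligible weight:", w.round(3))
```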

    A Bayesian Nonparametric Approach to Modeling Motion Patterns

    The most difficult—and often most essential—aspect of many interception and tracking tasks is constructing motion models of the targets to be found. Experts can often provide only partial information, and fitting parameters for complex motion patterns can require large amounts of training data. Specifying how to parameterize complex motion patterns is in itself a difficult task. In contrast, nonparametric models are very flexible and generalize well with relatively little training data. We propose modeling target motion patterns as a mixture of Gaussian processes (GP) with a Dirichlet process (DP) prior over mixture weights. The GP provides a flexible representation for each individual motion pattern, while the DP assigns observed trajectories to particular motion patterns. Both automatically adjust the complexity of the motion model based on the available data. Our approach outperforms several parametric models on a helicopter-based car-tracking task, using data collected from the greater Boston area.
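    A minimal sketch of the model's two ingredients (the interfaces and kernel choices are assumptions, not the paper's code): each motion pattern is a GP mapping position to a velocity component, and a CRP-style rule assigns a new trajectory to the pattern whose GP predicts it best, or opens a new pattern.

```python
# Sketch: CRP assignment of a trajectory to a GP motion pattern.
# Interfaces and kernel choices are assumptions for illustration.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def assign_trajectory(traj_xy, traj_v, clusters, alpha=1.0):
    """traj_xy: (n, 2) positions; traj_v: (n,) velocity component.
    clusters: list of (X, y, count) holding each pattern's past data."""
    scores = [np.log(alpha)]                     # option 0: open a new pattern
    for X, y, count in clusters:
        gp = GaussianProcessRegressor(kernel=RBF(5.0) + WhiteKernel(0.1)).fit(X, y)
        mu, sd = gp.predict(traj_xy, return_std=True)
        fit = norm.logpdf(traj_v, mu, sd).sum()  # GP predictive density
        scores.append(np.log(count) + fit)       # CRP weight times likelihood
    k = int(np.argmax(scores))
    return None if k == 0 else k - 1             # None -> start a new pattern
```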

    Does the GDPR Help or Hinder Fair Algorithmic Decision-Making?
