Parameter estimation in softmax decision-making models with linear objective functions
With an eye towards human-centered automation, we contribute to the
development of a systematic means to infer features of human decision-making
from behavioral data. Motivated by the common use of softmax selection in
models of human decision-making, we study the maximum likelihood parameter
estimation problem for softmax decision-making models with linear objective
functions. We present conditions under which the likelihood function is convex.
These allow us to provide sufficient conditions for convergence of the
resulting maximum likelihood estimator and to construct its asymptotic
distribution. In the case of models with nonlinear objective functions, we show
how the estimator can be applied by linearizing about a nominal parameter
value. We apply the estimator to fit the stochastic UCL (Upper Credible Limit)
model of human decision-making to human subject data. We show statistically
significant differences in behavior across related, but distinct, tasks.
Comment: In pres
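As a rough illustration of the estimation problem described in this abstract, the sketch below fits a softmax choice model with a linear objective by maximum likelihood. It is not the paper's code: the array layout (`features` of shape trials x options x parameters, `choices` as chosen option indices), the function names, and the use of plain fixed-step gradient descent are all assumptions made for the example. The negative log-likelihood of this model is convex in `theta`, which is why a simple descent method is adequate here.

```python
import numpy as np


def neg_log_likelihood(theta, features, choices):
    """Negative log-likelihood of observed choices under a softmax model.

    features: (T, K, d) array, feature vector of each of K options on T trials
              (hypothetical layout chosen for this sketch).
    choices:  (T,) integer array, index of the option chosen on each trial.
    """
    scores = features @ theta                           # (T, K) linear objective values
    scores = scores - scores.max(axis=1, keepdims=True)  # shift for numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(choices)), choices].sum()


def gradient(theta, features, choices):
    """Gradient of the negative log-likelihood with respect to theta."""
    scores = features @ theta
    scores = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)            # softmax choice probabilities
    expected = np.einsum("tk,tkd->td", probs, features)   # model-expected features
    chosen = features[np.arange(len(choices)), choices]   # features of chosen options
    return (expected - chosen).sum(axis=0)


def fit(features, choices, lr=0.1, steps=2000):
    """Fixed-step gradient descent on the average negative log-likelihood."""
    theta = np.zeros(features.shape[-1])
    n = len(choices)
    for _ in range(steps):
        theta -= lr * gradient(theta, features, choices) / n
    return theta
```

With simulated or recorded choice data, `fit(features, choices)` returns a point estimate of the parameters. The asymptotic distribution and the linearization for nonlinear objectives mentioned in the abstract are not covered by this sketch.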
Stick-Breaking Policy Learning in Dec-POMDPs
Expectation maximization (EM) has recently been shown to be an efficient
algorithm for learning finite-state controllers (FSCs) in large decentralized
POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often
converge to maxima that are far from optimal. This paper considers a
variable-size FSC to represent the local policy of each agent. These
variable-size FSCs are constructed using a stick-breaking prior, leading to a
new framework called decentralized stick-breaking policy representation
(Dec-SBPR). This approach learns the controller parameters with a variational
Bayesian algorithm without having to assume that the Dec-POMDP model is
available. The performance of Dec-SBPR is demonstrated on several benchmark
problems, showing that the algorithm scales to large problems while
outperforming other state-of-the-art methods.
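For readers unfamiliar with the term, a stick-breaking prior can be sketched as follows. This is the generic truncated stick-breaking construction, not the specific prior or the variational algorithm used in Dec-SBPR; the function name, the `Beta(1, alpha)` parameterization, and the truncation level are assumptions made for the example. The weights it produces typically decay with index, which is what lets a model concentrate mass on a small number of controller nodes while keeping a larger number available.

```python
import numpy as np


def stick_breaking_weights(alpha, truncation, rng):
    """Draw one set of truncated stick-breaking weights.

    alpha:      concentration parameter (smaller values put more mass on early pieces).
    truncation: number of pieces; the last piece takes whatever stick remains.
    rng:        a numpy random Generator.
    """
    v = rng.beta(1.0, alpha, size=truncation)   # fraction broken off at each step
    v[-1] = 1.0                                 # final piece absorbs the remainder
    # Length of stick remaining before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining
```

Calling `stick_breaking_weights(2.0, 10, np.random.default_rng(0))` returns ten nonnegative weights that sum to one.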