36 research outputs found

    Parameter estimation in softmax decision-making models with linear objective functions

    With an eye towards human-centered automation, we contribute to the development of a systematic means to infer features of human decision-making from behavioral data. Motivated by the common use of softmax selection in models of human decision-making, we study the maximum likelihood parameter estimation problem for softmax decision-making models with linear objective functions. We present conditions under which the likelihood function is convex. These allow us to provide sufficient conditions for convergence of the resulting maximum likelihood estimator and to construct its asymptotic distribution. In the case of models with nonlinear objective functions, we show how the estimator can be applied by linearizing about a nominal parameter value. We apply the estimator to fit the stochastic UCL (Upper Credible Limit) model of human decision-making to human subject data. We show statistically significant differences in behavior across related, but distinct, tasks. Comment: In pres
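    The estimation problem described in this abstract can be illustrated with a minimal sketch. It assumes a generic softmax choice model in which option k at trial t has objective value theta . x[t, k] (linear in the parameter theta); the function names, the array layout, and the plain gradient-descent fit are illustrative assumptions, not the paper's estimator or the stochastic UCL model.

```python
import numpy as np

def neg_log_likelihood(theta, features, choices):
    """Negative log-likelihood of observed choices under a softmax model.

    features: array of shape (T, K, d), one d-dimensional feature vector
              per option per trial (hypothetical layout).
    choices:  integer array of shape (T,), index of the chosen option.
    """
    scores = features @ theta                          # (T, K) linear objectives
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(choices)), choices].sum()

def gradient(theta, features, choices):
    """Gradient of the negative log-likelihood with respect to theta."""
    scores = features @ theta
    scores = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    expected = np.einsum("tk,tkd->td", probs, features)  # model-expected features
    chosen = features[np.arange(len(choices)), choices]  # observed features
    return (expected - chosen).sum(axis=0)

def fit(features, choices, steps=500, lr=0.2):
    """Plain gradient descent from theta = 0 (step size is an assumption)."""
    theta = np.zeros(features.shape[2])
    for _ in range(steps):
        theta -= lr * gradient(theta, features, choices) / len(choices)
    return theta
```

    With a linear objective, this negative log-likelihood is a log-sum-exp term minus a linear term, so it is convex in theta, which is why a simple descent method suffices in the sketch.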

    Satisficing in multi-armed bandit problems

    Satisficing is a relaxation of maximizing and allows for less risky decision making in the face of uncertainty. We propose two sets of satisficing objectives for the multi-armed bandit problem, where the objective is to achieve reward-based decision-making performance above a given threshold. We show that these new problems are equivalent to various standard multi-armed bandit problems with maximizing objectives and use the equivalence to find bounds on performance. The different objectives can result in qualitatively different behavior; for example, agents explore their options continually in one case and only a finite number of times in another. For the case of Gaussian rewards we show an additional equivalence between the two sets of satisficing objectives that allows algorithms developed for one set to be applied to the other. We then develop variants of the Upper Credible Limit (UCL) algorithm that solve the problems with satisficing objectives and show that these modified UCL algorithms achieve efficient satisficing performance. Comment: To appear in IEEE Transactions on Automatic Contro

    A Dynamical System for Prioritizing and Coordinating Motivations

    We develop a dynamical systems approach to prioritizing multiple tasks in the context of a mobile robot. We take navigation as our prototypical task, and use vector field planners derived from navigation functions to encode control policies that achieve each individual task. We associate a scalar quantity with each task, representing its current importance to the robot; this value evolves in time as the robot achieves tasks. In our framework, the robot uses as its control input a convex combination of the individual task vector fields. The weights of the convex combination evolve dynamically according to a decision model adapted from the bio-inspired literature on swarm decision making, using the task values as an input. We study a simple case with two navigation tasks and derive conditions under which a stable limit cycle can be proven to emerge. While flowing along the limit cycle, the robot periodically navigates to each of the two goal locations; moreover, numerical study suggests that the basin of attraction is quite large so that significant perturbations are recovered with a reliable return to the desired task coordination pattern. For more information: Kod*lab and http://www.paulreverdy.com/2018/05/11/motivation-dynamics-simulations
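    The control input described in this abstract, a convex combination of task vector fields, can be sketched as follows. The attracting field `goal - x` is a simple stand-in for a navigation-function planner, and the function names are hypothetical; the dynamics that evolve the weights over time are not specified in the abstract and are not modeled here.

```python
import numpy as np

def attract(goal):
    """Return a vector field pointing toward `goal` (stand-in for a planner)."""
    goal = np.asarray(goal, dtype=float)
    return lambda x: goal - np.asarray(x, dtype=float)

def blended_field(x, weights, fields):
    """Control input as a convex combination of the task vector fields at x."""
    w = np.asarray(weights, dtype=float)
    # Convex combination: nonnegative weights that sum to one.
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(wi * f(x) for wi, f in zip(w, fields))
```

    For two goals at (1, 0) and (0, 1) and weights (0.25, 0.75), the blended input at the origin is (0.25, 0.75): the robot is pulled mostly toward the second goal.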

    Mobile Robots as Remote Sensors for Spatial Point Process Models

    Spatial point process models are a commonly used statistical tool for studying the distribution of objects of interest in a domain. We study the problem of deploying mobile robots as remote sensors to estimate the parameters of such a model, in particular the intensity parameter lambda which measures the mean density of points in a Poisson point process. This problem requires covering an appropriately large section of the domain while avoiding the objects, which we treat as obstacles. We develop a control law that covers an expanding section of the domain and an online criterion for determining when to stop sampling, i.e., when the covered area is large enough to achieve a desired level of estimation accuracy, and illustrate the resulting system with numerical simulations.
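    As a rough illustration of the estimation task in this abstract, the sketch below uses the textbook estimator for a homogeneous Poisson point process: if N points are observed in a covered area A, the maximum likelihood estimate of lambda is N / A, and its relative standard error is approximately 1 / sqrt(N). The stopping rule shown is an assumed, generic criterion based on that fact, not necessarily the online criterion developed in the paper; all function names are hypothetical.

```python
import math

def intensity_estimate(count, area):
    """MLE of the intensity of a homogeneous Poisson process: points per unit area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return count / area

def relative_std_error(count):
    """Approximate relative standard error of the estimate, 1 / sqrt(N)."""
    if count <= 0:
        return math.inf
    return 1.0 / math.sqrt(count)

def covered_enough(count, target_rel_error):
    """Assumed stopping rule: stop once the relative error is small enough."""
    return relative_std_error(count) <= target_rel_error
```

    Under this rule the required coverage depends on the number of points seen rather than on the area directly, so a sparse domain demands a larger covered area for the same accuracy.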

    Motivation dynamics for autonomous composition of navigation tasks

    We physically demonstrate a reactive sensorimotor architecture for mobile robots whose behaviors are generated by motivation dynamics. Motivation dynamics uses a continuous dynamical system to reactively compose low-level control vector fields using valuation functions which capture the potentially competing influences of external stimuli relative to the system's own internal state. We show that motivation dynamics 1) naturally accommodates external stimuli through standard signal processing tools, and 2) can effectively encode a repetitive higher-level task by composing several low-level controllers to achieve a limit cycle in which the robot repeatedly navigates towards two alternatively valuable goal locations in a commensurately alternating order. We show that these behaviors are robust to perturbations including imperfect models of robot kinematics, sensor noise, and disturbances resulting from the need to traverse difficult terrain. We argue that motivation dynamics can provide a useful alternative to controllers based on hybrid automata in situations where the control operates at a low level close to the physical hardware. For more information: Kod*la

    Spatial Sampling Strategies with Multiple Scientific Frames of Reference

    We study the spatial sampling strategies employed by field scientists studying aeolian processes, which are geophysical interactions between wind and terrain. As in geophysical field science in general, observations of aeolian processes are made and data gathered by carrying instruments to various locations and then deciding when and where to record a measurement. We focus on this decision-making process. Because sampling is physically laborious and time consuming, scientists often develop sampling plans in advance of deployment, i.e., employ an offline decision-making process. However, because of the unpredictable nature of field conditions, sampling strategies generally have to be updated online. By studying data from a large field deployment, we show that the offline strategies often consist of sampling along linear segments of physical space, called transects. We proceed by studying the sampling pattern on individual transects. For a given transect, we formulate model-based hypotheses that the scientists may be testing and derive sampling strategies that result in optimal hypothesis tests. Different underlying models lead to qualitatively different optimal sampling behavior. There is a clear mismatch between our first optimal sampling strategy and observed behavior, leading us to conjecture about other, more sophisticated hypothesis tests that may be driving expert decision-making behavior. For more information: Kod*la

    A drift-diffusion model for robotic obstacle avoidance

    We develop a stochastic framework for modeling and analysis of robot navigation in the presence of obstacles. We show that, with appropriate assumptions, the probability of a robot avoiding a given obstacle can be reduced to a function of a single dimensionless parameter which captures all relevant quantities of the problem. This parameter is analogous to the Peclet number considered in the literature on mass transport in advection-diffusion fluid flows. Using the framework we also compute statistics of the time required to escape an obstacle in an informative case. The results of the computation show that adding noise to the navigation strategy can improve performance. Finally, we present experimental results that illustrate these performance improvements on a robotic platform. For more information: Kod*La
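    For orientation, the classical Péclet number from advection-diffusion transport, to which the abstract compares its parameter, is the ratio of advective to diffusive transport rates, Pe = u L / D. The sketch below computes that standard quantity only; the mapping of robot speed, obstacle size, and noise intensity onto u, L, and D is an assumption for illustration, and the paper's own parameter may be defined differently.

```python
def peclet_number(speed, length, diffusivity):
    """Classical Peclet number Pe = u * L / D (dimensionless).

    speed:       characteristic drift speed u (e.g. m/s)
    length:      characteristic length L (e.g. m)
    diffusivity: diffusion coefficient D (e.g. m^2/s)
    """
    if diffusivity <= 0:
        raise ValueError("diffusivity must be positive")
    return speed * length / diffusivity
```

    Because the result is dimensionless, rescaling the units of length or time leaves it unchanged, which is what allows a single number to summarize the balance between drift and noise.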