44,860 research outputs found

    Perseus: Randomized Point-based Value Iteration for POMDPs

    Full text link
    Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. In contrast to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of every belief point in the set. We show how the same idea can be extended to deal with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
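    The core of the method is the randomized backup stage over a fixed belief set. Below is a minimal Python/NumPy sketch of such a stage for a discrete POMDP, assuming tabular transition matrices T[a] (S×S), observation matrices O[a] (S×|Ω|), reward vectors R[a], and a non-empty list of alpha-vectors; the helper names and data layout are illustrative assumptions, not the authors' implementation.

```python
import random
import numpy as np

def point_backup(b, alphas, T, O, R, gamma):
    # Standard point-based backup at belief b: return the alpha-vector of the
    # best one-step lookahead action, built from the current vector set.
    best_vec, best_val = None, -np.inf
    for a in range(len(R)):
        g_a = R[a].astype(float)                      # immediate reward vector
        for o in range(O[a].shape[1]):
            # g_{a,o}^i(s) = sum_{s'} P(o | s', a) P(s' | s, a) alpha_i(s')
            g_ao = [T[a] @ (O[a][:, o] * alpha) for alpha in alphas]
            g_a = g_a + gamma * max(g_ao, key=lambda g: g @ b)
        if g_a @ b > best_val:
            best_vec, best_val = g_a, g_a @ b
    return best_vec

def perseus_backup_stage(B, alphas, T, O, R, gamma):
    # One randomized backup stage: sample un-improved beliefs and back them up
    # until every belief in B has a value at least as good as before.
    value = lambda b, vecs: max(v @ b for v in vecs)
    new_alphas, todo = [], list(B)
    while todo:
        b = random.choice(todo)
        alpha = point_backup(b, alphas, T, O, R, gamma)
        if alpha @ b >= value(b, alphas):
            new_alphas.append(alpha)                              # improved point
        else:
            new_alphas.append(max(alphas, key=lambda v: v @ b))   # keep old best
        # the key observation: one new vector may improve many beliefs at once
        todo = [p for p in todo if value(p, new_alphas) < value(p, alphas)]
    return new_alphas
```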

    Design Guidelines for Agent Based Model Visualization

    Get PDF
    In the field of agent-based modeling (ABM), visualizations play an important role in identifying, communicating, and understanding important behavior of the modeled phenomenon. However, many modelers tend to create ineffective visualizations of agent-based models due to a lack of experience with visual design. This paper provides ABM visualization design guidelines in order to improve visual design with ABM toolkits. These guidelines will assist the modeler in creating clear and understandable ABM visualizations. We begin by introducing a non-hierarchical categorization of ABM visualizations. This categorization serves as a starting point in the creation of an ABM visualization. We go on to present well-known design techniques in the context of ABM visualization. These techniques are based on Gestalt psychology, the semiology of graphics, and scientific visualization. They improve the visualization design by facilitating specific tasks and providing a common language to critique visualizations through the use of visual variables. Subsequently, we discuss the application of these design techniques to simplify, emphasize, and explain an ABM visualization. Finally, we illustrate these guidelines using a simple redesign of a NetLogo ABM visualization. These guidelines can be used to inform the development of design tools that assist users in the creation of ABM visualizations.
    Keywords: Visualization, Design, Graphics, Guidelines, Communication, Agent-Based Modeling

    A vision-guided parallel parking system for a mobile robot using approximate policy iteration

    Get PDF
    Reinforcement Learning (RL) methods enable autonomous robots to learn skills from scratch by interacting with the environment. However, reinforcement learning can be very time-consuming. This paper focuses on accelerating the reinforcement learning process on a mobile robot in an unknown environment. The presented algorithm is based on approximate policy iteration with a continuous state space and a fixed number of actions. The action-value function is represented by a weighted combination of basis functions. Furthermore, a complexity analysis is provided to show that the implemented approach is guaranteed to converge to an optimal policy with less computational time. A parallel parking task is selected for testing purposes. In the experiments, the efficiency of the proposed approach is demonstrated and analyzed through a set of simulated and real robot experiments, with comparisons drawn against two well-known algorithms (Dyna-Q and Q-learning).
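    The abstract describes representing the action-value function as a weighted combination of basis functions and fitting it with approximate policy iteration. Below is a minimal Python/NumPy sketch of that kind of representation, using radial-basis features and an LSTD-Q evaluation step (as in LSPI-style approximate policy iteration); the feature layout, constants, and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

N_ACTIONS = 5                                          # fixed, discrete action set (assumed)
CENTERS = np.linspace(-1.0, 1.0, 9).reshape(-1, 1)     # RBF centres over a 1-D state (assumed)
WIDTH = 0.25

def phi(state, action):
    # Q(s, a) = w . phi(s, a): radial-basis features, one block per action.
    rbf = np.exp(-((state - CENTERS) ** 2).sum(axis=1) / (2 * WIDTH ** 2))
    feats = np.zeros(N_ACTIONS * len(CENTERS))
    feats[action * len(CENTERS):(action + 1) * len(CENTERS)] = rbf
    return feats

def lstdq(samples, w_old, gamma=0.95):
    # Policy evaluation: fit new weights for the greedy policy induced by w_old
    # from (state, action, reward, next_state) transition samples.
    n = N_ACTIONS * len(CENTERS)
    A, b = np.zeros((n, n)), np.zeros(n)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        a_next = max(range(N_ACTIONS), key=lambda ap: phi(s_next, ap) @ w_old)
        A += np.outer(f, f - gamma * phi(s_next, a_next))
        b += r * f
    return np.linalg.solve(A + 1e-6 * np.eye(n), b)    # small ridge term for stability

def greedy_action(state, w):
    # Policy improvement: act greedily with respect to the fitted Q-function.
    return max(range(N_ACTIONS), key=lambda a: phi(state, a) @ w)
```

    A full run would alternate lstdq (evaluation) and greedy_action (improvement) on freshly collected samples until the weight vector stabilizes.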

    Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder

    Full text link
    In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning) to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean paths that are (1) planned efficiently, eliminating first-move lags, and (2) collision-free and smooth, with the agents' kinematic constraints satisfied. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies that can generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG overcomes the limitations of sparse rewards and local-minimum traps for RL agents; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (e.g., unexpected obstacles appear), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can deal with uncertainties by exploiting its generalization ability to handle changes in environments. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL works well on large-scale maps with relatively low action-switching frequencies and shorter path lengths, and can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies.
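    A minimal sketch of the two-level execution scheme the abstract describes, assuming a subgoal-graph planner supplies the abstract path and a learned local policy (e.g. greedy with respect to an LSPI Q-function) drives between adjacent subgoals; every callable below is a hypothetical placeholder, not the paper's API.

```python
def sg_rl_execute(start, goal, subgoal_planner, motion_policy, step_env, reached,
                  max_steps=1000):
    # Level 1: the subgoal-graph planner returns an abstract path (subgoal sequence).
    # Level 2: the learned motion policy generates low-level actions that steer the
    # agent from one subgoal to the next; the environment step function applies them.
    state, trajectory = start, [start]
    for subgoal in subgoal_planner(start, goal):
        for _ in range(max_steps):
            if reached(state, subgoal):
                break
            state = step_env(state, motion_policy(state, subgoal))
            trajectory.append(state)
    return trajectory
```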

    Bayesian Quadratic Network Game Filters

    Full text link
    A repeated network game is considered in which agents have quadratic utilities that depend on information externalities -- an unknown underlying state -- as well as payoff externalities -- the actions of all other agents in the network. Agents play Bayesian Nash equilibrium strategies with respect to their beliefs about the state of the world and the actions of all other nodes in the network. These beliefs are refined over subsequent stages based on the observed actions of neighboring peers. This paper introduces the Quadratic Network Game (QNG) filter, which agents can run locally to update their beliefs, select corresponding optimal actions, and eventually learn a sufficient statistic of the network's state. The QNG filter is demonstrated on a Cournot market competition game and on a coordination game used to implement navigation of an autonomous team.
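    The belief-refinement step can be illustrated in the simplest linear-Gaussian setting: if a neighbor's action is (approximately) a linear function of its estimate of the unknown state plus noise, then observing that action is a linear-Gaussian measurement, and the agent's own belief can be updated with a scalar Kalman step. The sketch below shows only that generic idea, not the paper's QNG filter; all names and parameters are assumptions.

```python
def gaussian_belief_update(mu, var, neighbor_actions, action_gain, action_noise_var):
    # Treat each observed neighbor action as a noisy linear observation of the
    # unknown state: a = action_gain * state + noise, noise ~ N(0, action_noise_var).
    # Update the agent's Gaussian belief N(mu, var) with one Kalman step per action.
    for a in neighbor_actions:
        innovation = a - action_gain * mu
        s = action_gain ** 2 * var + action_noise_var   # innovation variance
        k = var * action_gain / s                        # Kalman gain
        mu = mu + k * innovation
        var = (1.0 - k * action_gain) * var
    return mu, var
```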

    Vision-based interface applied to assistive robots

    Get PDF
    This paper presents two vision-based interfaces for disabled people to command a mobile robot for personal assistance. The developed interfaces can be subdivided according to the image-processing algorithm implemented for the detection and tracking of two different body regions. The first interface detects and tracks movements of the user's head, and these movements are transformed into linear and angular velocities in order to command a mobile robot. The second interface detects and tracks movements of the user's hand, and these movements are transformed similarly. In addition, this paper also presents the control laws for the robot. The experimental results demonstrate good performance and a balance between complexity and feasibility for real-time applications.
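    As an illustration of the final transformation step (tracked body movements into robot velocity commands), the sketch below maps a pixel displacement of the tracked head or hand into bounded linear and angular velocities with a dead zone to reject jitter; the gains, limits, and sign conventions are assumptions, not the paper's control law.

```python
def displacement_to_velocity(dx, dy, k_lin=0.005, k_ang=0.01, dead_zone=10,
                             v_max=0.3, w_max=0.8):
    # Map the tracked feature's pixel displacement (dx, dy) from a reference
    # position to a (linear, angular) velocity command for the mobile robot.
    def shape(d, k, limit):
        if abs(d) < dead_zone:            # ignore small tracking jitter
            return 0.0
        return max(-limit, min(limit, k * d))
    v = shape(-dy, k_lin, v_max)          # vertical motion -> forward velocity [m/s]
    w = shape(-dx, k_ang, w_max)          # lateral motion  -> turn rate [rad/s]
    return v, w
```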

    Closed-loop Bayesian Semantic Data Fusion for Collaborative Human-Autonomy Target Search

    Full text link
    In search applications, autonomous unmanned vehicles must be able to efficiently reacquire and localize mobile targets that can remain out of view for long periods of time in large spaces. As such, all available information sources must be actively leveraged -- including imprecise but readily available semantic observations provided by humans. To achieve this, this work develops and validates a novel collaborative human-machine sensing solution for dynamic target search. Our approach uses continuous partially observable Markov decision process (CPOMDP) planning to generate vehicle trajectories that optimally exploit imperfect detection data from onboard sensors, as well as semantic natural language observations that can be specifically requested from human sensors. The key innovation is a scalable hierarchical Gaussian mixture model formulation for efficiently solving CPOMDPs with semantic observations in continuous dynamic state spaces. The approach is demonstrated and validated with a real human-robot team engaged in dynamic indoor target search and capture scenarios on a custom testbed.
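    One ingredient the abstract names is fusing semantic observations into a belief over a continuous state. The sketch below shows a generic version of that step for a Gaussian-mixture belief, reweighting each component by the semantic-observation likelihood evaluated at its mean; this is an illustrative approximation, not the paper's hierarchical GM formulation, and all names and values are assumptions.

```python
import numpy as np

def fuse_semantic_observation(weights, means, covs, likelihood):
    # Reweight each mixture component by the semantic-observation likelihood at
    # its mean (a common moment-style approximation), then renormalize.
    new_w = np.array([w * likelihood(m) for w, m in zip(weights, means)])
    total = new_w.sum()
    if total == 0.0:
        return weights, means, covs        # degenerate observation: keep prior belief
    return new_w / total, means, covs

# Example: "the target is near the doorway" as a soft indicator around a point.
doorway = np.array([4.0, 1.5])
near_doorway = lambda x: np.exp(-np.linalg.norm(x - doorway) ** 2 / 2.0)
```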