Perseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attractive
and principled framework for agent planning under uncertainty. Point-based
approximate techniques for POMDPs compute a policy based on a finite set of
points collected in advance from the agent's belief space. We present a
randomized point-based value iteration algorithm called Perseus. The algorithm
performs approximate value backup stages, ensuring that in each backup stage
the value of each point in the belief set is improved; the key observation is
that a single backup may improve the value of many belief points. Contrary to
other point-based methods, Perseus backs up only a (randomly selected) subset
of points in the belief set, sufficient for improving the value of each belief
point in the set. We show how the same idea can be extended to dealing with
continuous action spaces. Experimental results show the potential of Perseus in
large-scale POMDP problems.
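The backup stage described in the Perseus abstract above can be sketched roughly as follows. This is a minimal illustration for a small discrete POMDP, not the authors' implementation: the array layouts `T[a, s, s2]`, `O[a, s2, o]`, `R[a, s]`, the function names `backup` and `perseus_stage`, and the numerical tolerance are all assumptions made for this example.

```python
import numpy as np

def backup(b, V, T, O, R, gamma):
    """Point-based backup at belief b; returns one new alpha vector.

    Assumed layouts (hypothetical, chosen for this sketch):
    T[a, s, s2] = P(s2 | s, a), O[a, s2, o] = P(o | s2, a), R[a, s] = reward.
    V holds the current alpha vectors, shape (n_vectors, n_states).
    """
    n_actions, n_obs = T.shape[0], O.shape[2]
    best_vec, best_val = None, -np.inf
    for a in range(n_actions):
        g_a = R[a].astype(float)
        for o in range(n_obs):
            # g[s, i] = sum_s2 T[a, s, s2] * O[a, s2, o] * V[i, s2]
            g = (T[a] * O[a][:, o]) @ V.T
            # keep the back-projected vector that is best at belief b
            g_a = g_a + gamma * g[:, np.argmax(b @ g)]
        val = b @ g_a
        if val > best_val:
            best_vec, best_val = g_a, val
    return best_vec

def perseus_stage(B, V, T, O, R, gamma, rng):
    """One randomized backup stage over the belief set B (rows are beliefs)."""
    old_vals = (B @ V.T).max(axis=1)
    new_V = []
    todo = np.arange(len(B))          # beliefs whose value is not yet improved
    while todo.size > 0:
        idx = rng.choice(todo)        # sample a not-yet-improved belief
        b = B[idx]
        alpha = backup(b, V, T, O, R, gamma)
        if b @ alpha < old_vals[idx]:
            # backup did not help at b: keep the old maximizing vector
            alpha = V[np.argmax(V @ b)]
        new_V.append(alpha)
        new_vals = (B @ np.array(new_V).T).max(axis=1)
        # a single new vector may improve many beliefs at once
        todo = np.flatnonzero(new_vals < old_vals - 1e-12)
    return np.array(new_V)
```

Because every added vector can raise the value of several beliefs, the loop usually ends after far fewer backups than there are points in `B`, which is the effect the abstract highlights.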
Design Guidelines for Agent Based Model Visualization
In the field of agent-based modeling (ABM), visualizations play an important role in identifying, communicating and understanding important behavior of the modeled phenomenon. However, many modelers tend to create ineffective visualizations of agent-based models due to lack of experience with visual design. This paper provides ABM visualization design guidelines in order to improve visual design with ABM toolkits. These guidelines will assist the modeler in creating clear and understandable ABM visualizations. We begin by introducing a non-hierarchical categorization of ABM visualizations. This categorization serves as a starting point in the creation of an ABM visualization. We go on to present well-known design techniques in the context of ABM visualization. These techniques are based on Gestalt psychology, semiology of graphics, and scientific visualization. They improve the visualization design by facilitating specific tasks, and providing a common language to critique visualizations through the use of visual variables. Subsequently, we discuss the application of these design techniques to simplify, emphasize and explain an ABM visualization. Finally, we illustrate these guidelines using a simple redesign of a NetLogo ABM visualization. These guidelines can be used to inform the development of design tools that assist users in the creation of ABM visualizations.
Keywords: Visualization, Design, Graphics, Guidelines, Communication, Agent-Based Modeling
A vision-guided parallel parking system for a mobile robot using approximate policy iteration
Reinforcement Learning (RL) methods enable autonomous robots to learn skills from scratch by interacting with the environment. However, reinforcement learning can be very time-consuming. This paper focuses on accelerating the reinforcement learning process on a mobile robot in an unknown environment. The presented algorithm is based on approximate policy iteration with a continuous state space and a fixed number of actions. The action-value function is represented by a weighted combination of basis functions.
Furthermore, a complexity analysis is provided to show that the implemented approach is guaranteed to converge on an optimal policy with less computational time.
A parallel parking task is selected for testing purposes. In the experiments, the efficiency of the proposed approach is demonstrated and analyzed through a set of simulated and real robot experiments, with comparisons drawn against two well-known algorithms (Dyna-Q and Q-learning).
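To make "a weighted combination of basis functions" concrete, here is a minimal sketch of a linear action-value function with one policy-evaluation step in the style of least-squares policy iteration (LSTD-Q). The abstract does not say which basis functions or solver the paper uses, so the Gaussian radial basis functions, the names `features`, `greedy_action` and `lstdq`, and the regularization term are assumptions for illustration only.

```python
import numpy as np

def features(state, action, centers, n_actions, width=1.0):
    """Basis vector phi(s, a): a bias plus Gaussian RBFs, one block per action."""
    rbf = np.exp(-np.sum((centers - state) ** 2, axis=1) / (2.0 * width ** 2))
    block = np.concatenate(([1.0], rbf))
    phi = np.zeros(n_actions * block.size)
    # only the block belonging to the chosen action is non-zero
    phi[action * block.size:(action + 1) * block.size] = block
    return phi

def greedy_action(state, w, centers, n_actions):
    """Action maximizing Q(s, a) = w . phi(s, a) over the fixed action set."""
    q = [w @ features(state, a, centers, n_actions) for a in range(n_actions)]
    return int(np.argmax(q))

def lstdq(samples, w, centers, n_actions, gamma=0.95, reg=1e-3):
    """Evaluate the policy that is greedy w.r.t. w from (s, a, r, s_next) samples."""
    k = n_actions * (len(centers) + 1)
    A = reg * np.eye(k)               # small ridge term keeps A invertible
    b = np.zeros(k)
    for s, a, r, s_next in samples:
        phi = features(s, a, centers, n_actions)
        a_next = greedy_action(s_next, w, centers, n_actions)
        phi_next = features(s_next, a_next, centers, n_actions)
        A += np.outer(phi, phi - gamma * phi_next)
        b += phi * r
    return np.linalg.solve(A, b)      # new weight vector
```

Approximate policy iteration would repeat `w = lstdq(samples, w, ...)` until the weights stop changing; the greedy policy is implicit in `w`, so no separate policy table is stored.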
Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean
(1) efficient path planning that eliminates first-move lags and (2) collision-free,
smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG overcomes the sparse-reward and local-minima
limitations of RL agents; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (e.g., unexpected obstacles appearing), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies.
Comment: 20 page
Bayesian Quadratic Network Game Filters
A repeated network game where agents have quadratic utilities that depend on
information externalities -- an unknown underlying state -- as well as payoff
externalities -- the actions of all other agents in the network -- is
considered. Agents play Bayesian Nash Equilibrium strategies with respect to
their beliefs on the state of the world and the actions of all other nodes in
the network. These beliefs are refined over subsequent stages based on the
observed actions of neighboring peers. This paper introduces the Quadratic
Network Game (QNG) filter that agents can run locally to update their beliefs,
select corresponding optimal actions, and eventually learn a sufficient
statistic of the network's state. The QNG filter is demonstrated on a Cournot
market competition game and a coordination game to implement navigation of an
autonomous team.
Vision-based interface applied to assistive robots
This paper presents two vision-based interfaces for disabled people to command a mobile robot for personal assistance. The developed interfaces can be subdivided according to the image-processing algorithm implemented for the detection and tracking of two different body regions. The first interface detects and tracks movements of the user's head, and these movements are transformed into linear and angular velocities in order to command a mobile robot. The second interface detects and tracks movements of the user's hand, and these movements are similarly transformed. In addition, this paper also presents the control laws for the robot. The experimental results demonstrate good performance and balance between complexity and feasibility for real-time applications.
Affiliation: Pérez Berenguer, María Elisa. Universidad Nacional de San Juan. Facultad de Ingeniería. Departamento de Electrónica y Automática. Gabinete de Tecnología Médica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Soria, Carlos Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Juan. Instituto de Automática. Universidad Nacional de San Juan. Facultad de Ingeniería. Instituto de Automática; Argentina
Affiliation: López Celani, Natalia Martina. Universidad Nacional de San Juan. Facultad de Ingeniería. Departamento de Electrónica y Automática. Gabinete de Tecnología Médica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Nasisi, Oscar Herminio. Universidad Nacional de San Juan. Facultad de Ingeniería. Instituto de Automática; Argentina
Affiliation: Mut, Vicente Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Juan. Instituto de Automática. Universidad Nacional de San Juan. Facultad de Ingeniería. Instituto de Automática; Argentin
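As a rough illustration of how tracked head or hand movements might be turned into linear and angular velocity commands, the sketch below uses a proportional mapping with a dead zone and saturation. The abstract does not give the actual transformation or control laws, so the function name `movement_to_velocity`, the normalized displacement inputs, and every default parameter value are hypothetical.

```python
def movement_to_velocity(dx, dy, dead_zone=0.1, v_max=0.3, w_max=0.8):
    """Map a normalized displacement (dx, dy) to (linear, angular) velocity.

    Assumptions for this sketch: dx and dy are displacements of the tracked
    region from its neutral position, scaled to [-1, 1]; dy drives the linear
    velocity (m/s) and dx drives the angular velocity (rad/s).
    """
    def scale(d, limit):
        # ignore small involuntary movements around the neutral position
        if abs(d) <= dead_zone:
            return 0.0
        d = max(-1.0, min(1.0, d))            # saturate at full deflection
        sign = 1.0 if d > 0 else -1.0
        return sign * limit * (abs(d) - dead_zone) / (1.0 - dead_zone)

    return scale(dy, v_max), scale(dx, w_max)
```

The dead zone is a design choice that keeps the robot still while the user rests near the neutral pose; a real interface would tune it to the tracker's noise level.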
Closed-loop Bayesian Semantic Data Fusion for Collaborative Human-Autonomy Target Search
In search applications, autonomous unmanned vehicles must be able to
efficiently reacquire and localize mobile targets that can remain out of view
for long periods of time in large spaces. As such, all available information
sources must be actively leveraged -- including imprecise but readily available
semantic observations provided by humans. To achieve this, this work develops
and validates a novel collaborative human-machine sensing solution for dynamic
target search. Our approach uses continuous partially observable Markov
decision process (CPOMDP) planning to generate vehicle trajectories that
optimally exploit imperfect detection data from onboard sensors, as well as
semantic natural language observations that can be specifically requested from
human sensors. The key innovation is a scalable hierarchical Gaussian mixture
model formulation for efficiently solving CPOMDPs with semantic observations in
continuous dynamic state spaces. The approach is demonstrated and validated
with a real human-robot team engaged in dynamic indoor target search and
capture scenarios on a custom testbed.
Comment: Final version accepted and submitted to 2018 FUSION Conference
(Cambridge, UK, July 2018)