Generalized planning: Non-deterministic abstractions and trajectory constraints
We study the characterization and computation of general policies for families of problems that share a structure characterized by a common reduction into a single abstract problem. Policies μ that solve the abstract problem P have been shown to solve all problems Q that reduce to P, provided that μ terminates in Q. In this work, we shed light on why this termination condition is needed and how it can be removed. The key observation is that the abstract problem P captures the common structure among the concrete problems Q that is local (Markovian) but misses common structure that is global. We show how such global structure can be captured by means of trajectory constraints that in many cases can be expressed as LTL formulas, thus reducing generalized planning to LTL synthesis. Moreover, for a broad class of problems that involve integer variables that can be increased or decreased, trajectory constraints can be compiled away, reducing generalized planning to fully observable nondeterministic planning.
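As an illustration of the kind of problem family the abstract describes, here is a minimal sketch that is not taken from the paper. It assumes a single integer variable n that can only be decreased, and the names `abstract_state`, `policy`, and `run` are hypothetical. The policy sees only the Boolean feature n > 0, so the same policy applies to every initial value of n. It terminates in each concrete instance because a non-negative integer cannot be decreased forever, which is a global property that the local abstraction by itself does not express.

```python
from typing import Optional


def abstract_state(n: int) -> bool:
    """Map a concrete state (the integer n) to the abstract feature 'n > 0'."""
    return n > 0


def policy(n_positive: bool) -> Optional[str]:
    """Hypothetical general policy defined on the abstract state only."""
    return "dec" if n_positive else None  # None means the goal n == 0 holds


def run(n: int, max_steps: int = 10_000) -> int:
    """Execute the policy on a concrete instance; return the steps taken."""
    if n < 0:
        raise ValueError("n must be non-negative")
    steps = 0
    while steps < max_steps:
        action = policy(abstract_state(n))
        if action is None:
            return steps
        # Concrete effect of 'dec'. In the abstraction this effect is
        # non-deterministic: the feature n > 0 may or may not stay true.
        n -= 1
        steps += 1
    raise RuntimeError("policy did not terminate within max_steps")


if __name__ == "__main__":
    for start in (0, 1, 7):
        print(start, run(start))
```

Viewed only at the abstract level, the policy could loop forever in the state where n > 0 holds; the sketch terminates because the concrete variable is a bounded integer.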
Probabilistic Hybrid Action Models for Predicting Concurrent Percept-driven Robot Behavior
This article develops Probabilistic Hybrid Action Models (PHAMs), a realistic
causal model for predicting the behavior generated by modern percept-driven
robot plans. PHAMs represent aspects of robot behavior that cannot be
represented by most action models used in AI planning: the temporal structure
of continuous control processes, their non-deterministic effects, several modes
of their interferences, and the achievement of triggering conditions in
closed-loop robot plans.
The main contributions of this article are: (1) PHAMs, a model of concurrent
percept-driven behavior, its formalization, and proofs that the model generates
probably, qualitatively accurate predictions; and (2) a resource-efficient
inference method for PHAMs based on sampling projections from probabilistic
action models and state descriptions. We show how PHAMs can be applied to
planning the course of action of an autonomous robot office courier based on
analytical and experimental results.
Influence-Optimistic Local Values for Multiagent Planning --- Extended Version
Recent years have seen the development of methods for multiagent planning
under uncertainty that scale to tens or even hundreds of agents. However, most
of these methods either make restrictive assumptions on the problem domain, or
provide approximate solutions without any guarantees on quality. Methods in the
former category typically build on heuristic search using upper bounds on the
value function. Unfortunately, no techniques exist to compute such upper bounds
for problems with non-factored value functions. To allow for meaningful
benchmarking through measurable quality guarantees on a very general class of
problems, this paper introduces a family of influence-optimistic upper bounds
for factored decentralized partially observable Markov decision processes
(Dec-POMDPs) that do not have factored value functions. Intuitively, we derive
bounds on very large multiagent planning problems by subdividing them into
sub-problems, and at each of these sub-problems making optimistic assumptions
with respect to the influence that will be exerted by the rest of the system.
We numerically compare the different upper bounds and demonstrate how we can
achieve a non-trivial guarantee that a heuristic solution for problems with
hundreds of agents is close to optimal. Furthermore, we provide evidence that
the upper bounds may improve the effectiveness of heuristic influence search,
and discuss further potential applications to multiagent planning.
Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS 2015)
Affective and cognitive prefrontal cortex projections to the lateral habenula in humans
Anterior insula (AI) and dACC are known to process information about pain,
loss, adversities, bad, harmful or suboptimal choices and consequences that
threaten survival or well-being. Pain and loss also activate pregenual ACC
(pgACC), linked to sad thoughts, hurt and regrets. The lateral habenula (LHb)
is stimulated by predicted and received pain, discomfort, aversive outcomes and
loss. Its chronic stimulation makes us feel worse or low and gradually stops us
choosing and moving for suboptimal, hurtful or punished choices, by direct and
indirect (via RMTg) inhibition of DRN and VTA/SNc. Response selectivity of LHb
neurons suggests that they receive cortical input from affective and cognitive
evaluative regions that form expectations about bad or suboptimal outcomes. Based on these
facts I predicted direct corticohabenular projections from the dACC, pgACC and
AI, as part of the adversity processing circuit that learns to avoid bad
outcomes by suppressing dopamine and serotonin signals. Using DTI, I found dACC,
pgACC, AI, and adjacent caudolateral and lateral OFC projections to LHb. I
predicted no corticohabenular projections from the reward processing regions:
medial OFC and vACC because both respond most strongly to good, high value
stimuli and outcomes, inducing serotonin and dopamine release respectively.
This lack of LHb projections was confirmed for vACC and likely for mOFC. The
surprising findings were the corticohabenular projections from the cognitive
prefrontal cortex regions, known for flexible reasoning, planning and combining
whatever information is relevant for reaching current goals. I propose that
prefrontohabenular projections provide a teaching signal for value-based choice
behaviour, to learn to deselect, avoid or inhibit the potentially harmful, low
valued or wrong choices, goals, strategies, predictions, models and ways of
doing things, to prevent bad or suboptimal consequences.
Comment: I renamed the medioventral part of the anterior thalamus via which
the PFC to LHb fibre tracts pass from ventral anterior (AV) to medial anterior
thalamic region. Apologies for that. My co-author decided to remove his nam
Distributed Spacecraft Path Planning and Collision Avoidance via Reciprocal Velocity Obstacle Approach
This paper presents the development of a combined linear quadratic regulation and reciprocal velocity obstacle (LQR/RVO) control algorithm for multiple satellites during close proximity operations. The linear quadratic regulator (LQR) control effort drives the spacecraft towards their target positions, while the reciprocal velocity obstacle (RVO) provides collision avoidance capabilities. Each spacecraft maneuvers independently, without explicit communication with, or knowledge of the collision avoidance decision making of, the other spacecraft in the formation. To assess the performance of this novel controller, different test cases are implemented. Numerical results show that this method guarantees safe and collision-free maneuvers for all the satellites in the formation, and the control performance is presented in terms of Δv and fuel consumption.
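To make the collision avoidance part concrete, the sketch below shows the standard geometric membership test for a velocity obstacle (VO) and a reciprocal velocity obstacle (RVO) between two spherical agents. It is not the paper's controller: it leaves out the LQR term entirely, assumes an infinite time horizon and a combined radius `radius`, and all function names are hypothetical. A candidate velocity lies in the VO when the ray along the relative velocity hits the ball around the other agent; it lies in the RVO when the same test holds for `2 * v_new - v_a`, so each agent takes on half of the avoidance effort.

```python
from typing import Sequence

Vec = Sequence[float]


def _dot(a: Vec, b: Vec) -> float:
    return sum(x * y for x, y in zip(a, b))


def ray_hits_ball(rel_pos: Vec, direction: Vec, radius: float) -> bool:
    """True if the ray from the origin along `direction` meets the ball of
    the given radius centred at `rel_pos` (works in 2D or 3D)."""
    dist_sq = _dot(rel_pos, rel_pos)
    if dist_sq <= radius * radius:
        return True  # the agents already overlap
    dir_sq = _dot(direction, direction)
    if dir_sq == 0.0:
        return False  # no relative motion
    along = _dot(rel_pos, direction)
    if along <= 0.0:
        return False  # moving away from the other agent
    # Squared distance from the ball centre to the line of the ray.
    closest_sq = dist_sq - along * along / dir_sq
    return closest_sq < radius * radius


def in_velocity_obstacle(p_a: Vec, p_b: Vec, v_b: Vec,
                         radius: float, v_new: Vec) -> bool:
    """Candidate velocity v_new of A collides with B if B keeps v_b."""
    rel = [b - a for a, b in zip(p_a, p_b)]
    rel_vel = [vn - vb for vn, vb in zip(v_new, v_b)]
    return ray_hits_ball(rel, rel_vel, radius)


def in_reciprocal_velocity_obstacle(p_a: Vec, p_b: Vec, v_a: Vec, v_b: Vec,
                                    radius: float, v_new: Vec) -> bool:
    """v_new is in the RVO iff 2 * v_new - v_a is in the VO."""
    rel = [b - a for a, b in zip(p_a, p_b)]
    rel_vel = [2.0 * vn - va - vb for vn, va, vb in zip(v_new, v_a, v_b)]
    return ray_hits_ball(rel, rel_vel, radius)


if __name__ == "__main__":
    p_a, p_b = (0.0, 0.0), (10.0, 0.0)
    v_a, v_b = (1.0, 0.0), (-1.0, 0.0)
    for v in [(1.0, 0.0), (1.0, 0.3), (0.0, 1.0)]:
        print(v,
              in_velocity_obstacle(p_a, p_b, v_b, 2.0, v),
              in_reciprocal_velocity_obstacle(p_a, p_b, v_a, v_b, 2.0, v))
```

In the head-on example, the small sideways correction (1.0, 0.3) is still inside the VO but already outside the RVO, because the RVO assumes the other agent makes a matching correction.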