
    Generalized planning: Non-deterministic abstractions and trajectory constraints

    We study the characterization and computation of general policies for families of problems that share a structure characterized by a common reduction into a single abstract problem. Policies μ that solve the abstract problem P have been shown to solve all problems Q that reduce to P, provided that μ terminates in Q. In this work, we shed light on why this termination condition is needed and how it can be removed. The key observation is that the abstract problem P captures the common structure among the concrete problems Q that is local (Markovian) but misses common structure that is global. We show how such global structure can be captured by means of trajectory constraints that in many cases can be expressed as LTL formulas, thus reducing generalized planning to LTL synthesis. Moreover, for a broad class of problems that involve integer variables that can be increased or decreased, trajectory constraints can be compiled away, reducing generalized planning to fully observable nondeterministic planning.

    Probabilistic Hybrid Action Models for Predicting Concurrent Percept-driven Robot Behavior

    This article develops Probabilistic Hybrid Action Models (PHAMs), a realistic causal model for predicting the behavior generated by modern percept-driven robot plans. PHAMs represent aspects of robot behavior that cannot be represented by most action models used in AI planning: the temporal structure of continuous control processes, their non-deterministic effects, several modes of their interferences, and the achievement of triggering conditions in closed-loop robot plans. The main contributions of this article are: (1) PHAMs, a model of concurrent percept-driven behavior, its formalization, and proofs that the model generates probably, qualitatively accurate predictions; and (2) a resource-efficient inference method for PHAMs based on sampling projections from probabilistic action models and state descriptions. Based on analytical and experimental results, we show how PHAMs can be applied to planning the course of action of an autonomous robot office courier.

    Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

    Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions on the problem domain, or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems, and at each of these sub-problems making optimistic assumptions with respect to the influence that will be exerted by the rest of the system. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning. Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS 2015)

    Affective and cognitive prefrontal cortex projections to the lateral habenula in humans

    Anterior insula (AI) and dACC are known to process information about pain, loss, adversities, and bad, harmful or suboptimal choices and consequences that threaten survival or well-being. Pain and loss also activate pregenual ACC (pgACC), linked to sad thoughts, hurt and regrets. The lateral habenula (LHb) is stimulated by predicted and received pain, discomfort, aversive outcome, and loss. Its chronic stimulation makes us feel worse/low and gradually stops us choosing and moving for suboptimal, hurtful or punished choices, by direct and indirect (via RMTg) inhibition of DRN and VTA/SNc. Response selectivity of LHb neurons suggests their cortical input from affective and cognitive evaluative regions that make expectations about bad or suboptimal outcomes. Based on these facts I predicted direct corticohabenular projections from the dACC, pgACC and AI, as part of the adversity processing circuit that learns to avoid bad outcomes by suppressing dopamine and serotonin signals. Using DTI I found dACC, pgACC, AI, adjacent caudolateral and lateral OFC projections to LHb. I predicted no corticohabenular projections from the reward processing regions, medial OFC and vACC, because both respond most strongly to good, high value stimuli and outcomes, inducing serotonin and dopamine release respectively. This lack of LHb projections was confirmed for vACC and likely for mOFC. The surprising findings were the corticohabenular projections from the cognitive prefrontal cortex regions, known for flexible reasoning, planning and combining whatever information is relevant for reaching current goals.
    I propose that prefrontohabenular projections provide a teaching signal for value-based choice behaviour, to learn to deselect, avoid or inhibit the potentially harmful, low valued or wrong choices, goals, strategies, predictions, models and ways of doing things, to prevent bad or suboptimal consequences. Comment: I renamed the medioventral part of the anterior thalamus via which the PFC to LHb fibre tracts from ventral anterior (AV) to medial anterior thalamic region. Apologies for that. My co-author decided to remove his nam

    Distributed Spacecraft Path Planning and Collision Avoidance via Reciprocal Velocity Obstacle Approach

    This paper presents the development of a combined linear quadratic regulation and reciprocal velocity obstacle (LQR/RVO) control algorithm for multiple satellites during close proximity operations. The linear quadratic regulator (LQR) control effort drives the spacecraft towards their target positions, while the reciprocal velocity obstacle (RVO) provides collision avoidance capabilities. Each spacecraft maneuvers independently, without explicit communication with, or knowledge of the collision avoidance decisions of, the other spacecraft in the formation. To assess the performance of this novel controller, different test cases are implemented. Numerical results show that this method guarantees safe and collision-free maneuvers for all the satellites in the formation, and the control performance is presented in terms of Δv and fuel consumption.
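
    As a rough illustration of the collision-avoidance half of the LQR/RVO idea, the sketch below tests whether a candidate velocity falls inside a reciprocal velocity obstacle. It is not the paper's controller: it assumes planar motion, constant velocities, an unbounded time horizon and a single combined safety radius, and the function names (`in_velocity_obstacle`, `in_reciprocal_velocity_obstacle`) are hypothetical. It uses the commonly cited RVO definition, in which a candidate velocity v' is forbidden for A when 2v' - v_A lies in the ordinary velocity obstacle induced by B.

```python
import math

def in_velocity_obstacle(rel_pos, rel_vel, radius):
    """Return True if holding rel_vel forever brings the bodies within `radius`.

    rel_pos = p_B - p_A and rel_vel = v_A - v_B (assumed 2-D tuples).
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # No relative motion: the separation never changes.
        return math.hypot(px, py) < radius
    # Time of closest approach, clamped so we only look forward in time.
    t = max(0.0, (px * vx + py * vy) / speed_sq)
    return math.hypot(px - vx * t, py - vy * t) < radius

def in_reciprocal_velocity_obstacle(pos_a, vel_a, pos_b, vel_b, candidate, radius):
    """Return True if `candidate` is a forbidden new velocity for A under RVO."""
    rel_pos = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
    # RVO shifts the test point: check 2*v' - v_A against the plain VO of B.
    test_vel = (2 * candidate[0] - vel_a[0], 2 * candidate[1] - vel_a[1])
    rel_vel = (test_vel[0] - vel_b[0], test_vel[1] - vel_b[1])
    return in_velocity_obstacle(rel_pos, rel_vel, radius)

# Two bodies approaching head-on along the x-axis (illustrative numbers only).
head_on = in_reciprocal_velocity_obstacle((0, 0), (1, 0), (10, 0), (-1, 0), (1, 0), 2.0)
sidestep = in_reciprocal_velocity_obstacle((0, 0), (1, 0), (10, 0), (-1, 0), (1, 0.5), 2.0)
```

    In a combined scheme of the kind the abstract describes, a check like this would filter the velocity requested by the tracking controller; how the paper actually couples the two is not stated in the abstract.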