Sensor Synthesis for POMDPs with Reachability Objectives
Partially observable Markov decision processes (POMDPs) are widely used in
probabilistic planning problems in which an agent interacts with an environment
using noisy and imprecise sensors. We study a setting in which the sensors are
only partially defined and the goal is to synthesize "weakest" additional
sensors, such that in the resulting POMDP, there is a small-memory policy for
the agent that almost-surely (with probability 1) satisfies a reachability
objective. We show that the problem is NP-complete, and present a symbolic
algorithm by encoding the problem into SAT instances. We illustrate trade-offs
between the amount of memory of the policy and the number of additional sensors
on a simple example. We have implemented our approach and consider three
classical POMDP examples from the literature, and show that in all the examples
the number of sensors can be significantly decreased (as compared to the
existing solutions in the literature) without increasing the complexity of the
policies. Comment: arXiv admin note: text overlap with arXiv:1511.0845
Energy Efficient Execution of POMDP Policies
Recent advances in planning techniques for partially observable Markov decision processes have focused on online search techniques and offline point-based value iteration. While these techniques allow practitioners to obtain policies for fairly large problems, they assume that a non-negligible amount of computation can be done between decision points. In contrast, the recent proliferation of mobile and embedded devices has led to a surge of applications that could benefit from state-of-the-art planning techniques if those techniques can operate under severe constraints on computational resources. To that end, we describe two techniques to compile policies into controllers that can be executed by a mere table lookup at each decision point. The first approach compiles policies induced by a set of alpha vectors (such as those obtained by point-based techniques) into approximately equivalent controllers, while the second approach performs a simulation to compile arbitrary policies into approximately equivalent controllers. We also describe an approach to compress controllers by removing redundant and dominated nodes, often yielding smaller and yet better controllers. Further compression and higher value can sometimes be obtained by considering stochastic controllers. The compilation and compression techniques are demonstrated on benchmark problems as well as on a mobile application that helps persons with Alzheimer's disease with wayfinding. The battery consumption of several POMDP policies is compared against that of finite-state controllers learned using the methods introduced in this paper. Experiments performed on the Nexus 4 phone show that finite-state controllers consume the least battery of the POMDP policies compared.
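The "table lookup at each decision point" mentioned in this abstract can be illustrated with a minimal sketch of a deterministic finite-state controller. This is not the paper's compilation procedure. The class name, the two-node controller, and the action and observation labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteStateController:
    """Deterministic finite-state controller (illustrative sketch only)."""

    # action[n] is the action emitted while the controller is in node n.
    action: tuple
    # successor[n][o] is the node reached from node n after observation o.
    successor: tuple
    start: int = 0

    def run(self, observations):
        """Return the actions chosen along a sequence of observation indices."""
        node = self.start
        actions = [self.action[node]]
        for obs in observations:
            # Executing the policy is one table lookup per decision point.
            node = self.successor[node][obs]
            actions.append(self.action[node])
        return actions


# Hypothetical two-node controller: node 0 listens, node 1 opens a door.
controller = FiniteStateController(
    action=("listen", "open"),
    successor=((0, 1), (0, 0)),
)

print(controller.run([0, 1, 0]))
```

No belief update or search happens at run time in this sketch, which is the property that makes such controllers attractive on constrained devices.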
Relational Approach to Knowledge Engineering for POMDP-based Assistance Systems as a Translation of a Psychological Model
Assistive systems for persons with cognitive disabilities (e.g. dementia) are
difficult to build due to the wide range of different approaches people can
take to accomplishing the same task, and the significant uncertainties that
arise both from the unpredictability of clients' behaviours and from noise in
sensor readings. Partially observable Markov decision process (POMDP) models
have been used successfully as the reasoning engine behind such assistive
systems for small multi-step tasks such as hand washing. POMDP models are a
powerful yet flexible framework for modelling assistance that can deal with
uncertainty and utility. Unfortunately, POMDPs usually require a very
labour-intensive, manual procedure for their definition and construction. Our
previous work has described a knowledge-driven method for automatically
generating POMDP activity recognition and context-sensitive prompting systems
for complex tasks.
We call the resulting POMDP a SNAP (SyNdetic Assistance Process). The
spreadsheet-like result of the analysis does not correspond directly to the
POMDP model, so a translation to a formal POMDP representation is required. To
date, this translation has had to be performed manually by a trained POMDP expert.
In this paper, we formalise and automate this translation process using a
probabilistic relational model (PRM) encoded in a relational database. We
demonstrate the method by eliciting three assistance tasks from non-experts. We
validate the resulting POMDP models using case-based simulations to show that
they are reasonable for the domains. We also show a complete case study of a
designer specifying one database, including an evaluation in a real-life
experiment with a human actor.
Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in an office environment. Comment: In Proceedings of the 34th AAAI Conference on Artificial Intelligence,
202
KR: An Architecture for Knowledge Representation and Reasoning in Robotics
This paper describes an architecture that combines the complementary
strengths of declarative programming and probabilistic graphical models to
enable robots to represent, reason with, and learn from, qualitative and
quantitative descriptions of uncertainty and knowledge. An action language is
used for the low-level (LL) and high-level (HL) system descriptions in the
architecture, and the definition of recorded histories in the HL is expanded to
allow prioritized defaults. For any given goal, tentative plans created in the
HL using default knowledge and commonsense reasoning are implemented in the LL
using probabilistic algorithms, with the corresponding observations used to
update the HL history. Tight coupling between the two levels enables automatic
selection of relevant variables and generation of suitable action policies in
the LL for each HL action, and supports reasoning with violation of defaults,
noisy observations and unreliable actions in large and complex domains. The
architecture is evaluated in simulation and on physical robots transporting
objects in indoor domains; the benefit on robots is a reduction in task
execution time of 39% compared with a purely probabilistic, but still
hierarchical, approach. Comment: The paper appears in the Proceedings of the 15th International
Workshop on Non-Monotonic Reasoning (NMR 2014
Structured Possibilistic Planning Using Decision Diagrams
Qualitative Possibilistic Mixed-Observable MDPs (π-MOMDPs), generalizing π-MDPs and π-POMDPs, are models well suited to planning under uncertainty with mixed observability when transition, observation and reward functions are not precisely known and can be qualitatively described. Functions defining the model, as well as intermediate calculations, are valued in a finite possibilistic scale L, which induces a finite belief state space under partial observability, unlike its probabilistic counterpart. In this paper, we propose the first study of factored π-MOMDP models in order to solve large structured planning problems that are either posed under qualitative uncertainty or considered as qualitative approximations of probabilistic problems. Building upon the SPUDD algorithm for solving factored (probabilistic) MDPs, we conceived a symbolic algorithm named PPUDD for solving factored π-MOMDPs. Whereas the number of leaves in SPUDD’s decision diagrams may be as large as the state space, since their values are real numbers aggregated through additions and multiplications, PPUDD’s leaves always remain in the finite scale L because only min and max operations are used. Our experiments show that PPUDD’s computation time is much lower than that of SPUDD, Symbolic-HSVI and APPL for possibilistic and probabilistic versions of the same benchmarks under either total or mixed observability, while still providing high-quality policies.
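To illustrate why values stay in the finite scale L when only min and max are used, the following is a minimal sketch of an optimistic qualitative value iteration on a flat (non-factored) π-MDP. It is not PPUDD, which operates on decision diagrams. The integer scale {0, 1, 2}, the three-state example, and every name in the code are assumptions made for illustration, and the non-decreasing update assumes the agent may always keep its current value, as with a "stay" action.

```python
def possibilistic_value_iteration(states, actions, pi, mu, max_iters=100):
    """Optimistic qualitative value iteration on a flat pi-MDP (sketch).

    pi[(s, a)] maps successor states to possibility degrees in the scale;
    mu maps each state to its terminal preference degree in the same scale.
    """
    u = dict(mu)
    for _ in range(max_iters):
        new_u = {}
        for s in states:
            # Only max and min are applied, so no value outside the scale appears.
            best = max(
                min(pi[(s, a)].get(t, 0), u[t])
                for a in actions
                for t in states
            )
            # Assumption: keeping the previous value is always allowed.
            new_u[s] = max(u[s], best)
        if new_u == u:
            break
        u = new_u
    return u


# Hypothetical example on the scale L = {0, 1, 2} (2 = fully possible/preferred).
states = ["A", "B", "G"]
actions = ["go"]
pi = {
    ("A", "go"): {"B": 2, "A": 1},
    ("B", "go"): {"G": 1, "B": 2},
    ("G", "go"): {"G": 2},
}
mu = {"A": 0, "B": 0, "G": 2}

values = possibilistic_value_iteration(states, actions, pi, mu)
print(values)
```

Every computed value is one of the degrees already present in `pi` or `mu`, in contrast with probabilistic value iteration, where sums and products generate new real numbers at each step.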
A Survey of Knowledge-based Sequential Decision Making under Uncertainty
Reasoning with declarative knowledge (RDK) and sequential decision-making
(SDM) are two key research areas in artificial intelligence. RDK methods reason
with declarative domain knowledge, including commonsense knowledge, that is
either provided a priori or acquired over time, while SDM methods
(probabilistic planning and reinforcement learning) seek to compute action
policies that maximize the expected cumulative utility over a time horizon;
both classes of methods reason in the presence of uncertainty. Despite the rich
literature in these two areas, researchers have not fully explored their
complementary strengths. In this paper, we survey algorithms that leverage RDK
methods while making sequential decisions under uncertainty. We discuss
significant developments, open problems, and directions for future work.