507 research outputs found

Planning in hybrid relational MDPs

Author: AK Mausam
C Wang
Davide Nitti
J Lloyd
Luc De Raedt
M Kearns
M Wiering
N Meuleau
R Givan
RS Sutton
S Džeroski
S Hölldobler
T Lang
Tinne De Laet
U Nilsson
Vaishak Belle
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/09/2017

We study planning in relational Markov decision processes involving discrete and continuous states and actions, and an unknown number of objects. This combination of hybrid relational domains has so far received little attention. While both relational and hybrid approaches have been studied separately, planning in such domains is still challenging and often requires restrictive assumptions and approximations. We propose HYPE: a sample-based planner for hybrid relational domains that combines model-based approaches with state abstraction. HYPE samples episodes and uses the previous episodes as well as the model to approximate the Q-function. In addition, abstraction is performed for each sampled episode; this removes the complexity of symbolic approaches for hybrid relational domains. In our empirical evaluations, we show that HYPE is a general and widely applicable planner in domains ranging from strictly discrete to strictly continuous to hybrid ones, and that it handles intricacies such as unknown objects and relational models. Moreover, empirical results show that abstraction provides significant improvements.

Status: published

Lirias

Crossref

Edinburgh Research Explorer

Solving Factored MDPs with Hybrid State and Action Variables

Author: Guestrin C.
Hauskrecht M.
Kveton B.
Publication venue: 'AI Access Foundation'
Publication date: 30/09/2011

Efficient representations and solutions for large decision problems with continuous and discrete variables are among the most important challenges faced by the designers of automated decision support systems. In this paper, we describe a novel hybrid factored Markov decision process (MDP) model that allows for a compact representation of these problems, and a new hybrid approximate linear programming (HALP) framework that permits their efficient solution. The central idea of HALP is to approximate the optimal value function by a linear combination of basis functions and optimize its weights by linear programming. We analyze both theoretical and computational aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.
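For orientation, the central idea in the abstract above can be written out using the standard approximate linear programming formulation for a discrete MDP. This is a sketch under assumptions, not the paper's exact HALP program: the symbols below (basis functions f_i, weights w_i, state relevance weights ψ, reward R, transition model P, discount factor γ) are notation introduced here and do not appear in the abstract. In a hybrid setting, the sums over continuous state components would be replaced by integrals.

```latex
% Linear value-function approximation (assumed notation):
V^{w}(x) = \sum_{i} w_i \, f_i(x)

% Approximate linear program over the weights w:
\min_{w} \; \sum_{x} \psi(x) \, V^{w}(x)
\quad \text{subject to} \quad
V^{w}(x) \;\ge\; R(x,a) + \gamma \sum_{x'} P(x' \mid x, a) \, V^{w}(x')
\qquad \forall\, x,\ a
```

Both the objective and the constraints are linear in w, which is why the weights can be optimized by linear programming.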

arXiv.org e-Print Archive

Crossref

A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making

Author: Fernández-Olivares Juan
Mesejo Pablo
Núñez-Molina Carlos
Publication date: 20/04/2023

The field of Sequential Decision Making (SDM) provides tools for solving Sequential Decision Processes (SDPs), where an agent must make a series of decisions in order to complete a task or achieve a goal. Historically, two competing SDM paradigms have vied for supremacy. Automated Planning (AP) proposes to solve SDPs by performing a reasoning process over a model of the world, often represented symbolically. Conversely, Reinforcement Learning (RL) proposes to learn the solution of the SDP from data, without a world model, and represent the learned knowledge subsymbolically. In the spirit of reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques that learn to plan) and for learning aspects of their structure (e.g., world models, state invariants and landmarks). To the best of our knowledge, no other review in the field provides the same scope. As an additional contribution, we discuss what properties an ideal method for SDM should exhibit and argue that neurosymbolic AI is the current approach which most closely resembles this ideal method. Finally, we outline several proposals to advance the field of SDM via the integration of symbolic and subsymbolic AI.

arXiv.org e-Print Archive

Reinforcement Learning with Parameterized Actions

Author: Konidaris George
Masson Warwick
Ranchod Pravesh
Publication date: 26/11/2015

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions, that is, discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains. Comment: Accepted for AAAI 201
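To make the notion of a parameterized action concrete, here is a minimal Python sketch of the two-part choice the abstract describes: pick a discrete action, then pick continuous parameters for it. It is not the Q-PAMDP algorithm itself and performs no learning. The names `ParameterizedAction`, `select_action`, `q`, and `param_policies`, as well as the epsilon-greedy rule, are hypothetical and introduced only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

State = Tuple[float, ...]


@dataclass(frozen=True)
class ParameterizedAction:
    name: str                   # which discrete action was chosen
    params: Tuple[float, ...]   # continuous parameters for that action


def select_action(
    state: State,
    action_names: Sequence[str],
    q: Callable[[State, str], float],
    param_policies: Dict[str, Callable[[State], Tuple[float, ...]]],
    epsilon: float = 0.0,
    rng: Optional[random.Random] = None,
) -> ParameterizedAction:
    """Choose a discrete action (epsilon-greedy on q), then its parameters."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: pick a discrete action uniformly at random.
        name = rng.choice(list(action_names))
    else:
        # Exploit: pick the discrete action with the highest estimated value.
        name = max(action_names, key=lambda n: q(state, n))
    # The chosen action's own (hypothetical) policy supplies its parameters.
    params = tuple(param_policies[name](state))
    return ParameterizedAction(name, params)


if __name__ == "__main__":
    # Toy value estimates and parameter policies, made up for this example.
    toy_q = lambda s, n: {"kick": 1.0, "run": 0.5}[n]
    toy_policies = {
        "kick": lambda s: (s[0] * 2.0,),
        "run": lambda s: (0.1, 0.2),
    }
    print(select_action((1.5,), ["kick", "run"], toy_q, toy_policies))
```

In a learning setting, both `q` and the per-action parameter policies would be updated from experience; here they are fixed so that only the structure of the action space is shown.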

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

A Survey of Knowledge-based Sequential Decision Making under Uncertainty

Author: Sridharan Mohan
Zhang Shiqi
Publication date: 16/09/2020

Reasoning with declarative knowledge (RDK) and sequential decision-making (SDM) are two key research areas in artificial intelligence. RDK methods reason with declarative domain knowledge, including commonsense knowledge, that is either provided a priori or acquired over time, while SDM methods (probabilistic planning and reinforcement learning) seek to compute action policies that maximize the expected cumulative utility over a time horizon; both classes of methods reason in the presence of uncertainty. Despite the rich literature in these two areas, researchers have not fully explored their complementary strengths. In this paper, we survey algorithms that leverage RDK methods while making sequential decisions under uncertainty. We discuss significant developments, open problems, and directions for future work.

arXiv.org e-Print Archive

University of Birmingham Research Portal