12 research outputs found

    Fitted Q-Learning for Relational Domains

    Full text link
    We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of Bellman operator; application and projection steps can be performed using a gradient-boosting technique. Our proposed framework performs reasonably well on standard domains without using domain models and using fewer training trajectories.Comment: 10 pages, 12 figure

    The Complexity of Reasoning with FODD and GFODD

    Full text link
    Recent work introduced Generalized First Order Decision Diagrams (GFODD) as a knowledge representation that is useful in mechanizing decision theoretic planning in relational domains. GFODDs generalize function-free first order logic and include numerical values and numerical generalizations of existential and universal quantification. Previous work presented heuristic inference algorithms for GFODDs and implemented these heuristics in systems for decision theoretic planning. In this paper, we study the complexity of the computational problems addressed by such implementations. In particular, we study the evaluation problem, the satisfiability problem, and the equivalence problem for GFODDs under the assumption that the size of the intended model is given with the problem, a restriction that guarantees decidability. Our results provide a complete characterization placing these problems within the polynomial hierarchy. The same characterization applies to the corresponding restriction of problems in first order logic, giving an interesting new avenue for efficient inference when the number of objects is bounded. Our results show that for Σk\Sigma_k formulas, and for corresponding GFODDs, evaluation and satisfiability are Σkp\Sigma_k^p complete, and equivalence is Πk+1p\Pi_{k+1}^p complete. For Πk\Pi_k formulas evaluation is Πkp\Pi_k^p complete, satisfiability is one level higher and is Σk+1p\Sigma_{k+1}^p complete, and equivalence is Πk+1p\Pi_{k+1}^p complete.Comment: A short version of this paper appears in AAAI 2014. Version 2 includes a reorganization and some expanded proof

    Planning in hybrid relational MDPs

    Get PDF
    We study planning in relational Markov decision processes involving discrete and continuous states and actions, and an unknown number of objects. This combination of hybrid relational domains has so far not received a lot of attention. While both relational and hybrid approaches have been studied separately, planning in such domains is still challenging and often requires restrictive assumptions and approximations. We propose HYPE: a sample-based planner for hybrid relational domains that combines model-based approaches with state abstraction. HYPE samples episodes and uses the previous episodes as well as the model to approximate the Q-function. In addition, abstraction is performed for each sampled episode, this removes the complexity of symbolic approaches for hybrid relational domains. In our empirical evaluations, we show that HYPE is a general and widely applicable planner in domains ranging from strictly discrete to strictly continuous to hybrid ones, handles intricacies such as unknown objects and relational models. Moreover, empirical results showed that abstraction provides significant improvements.status: publishe
    corecore