    LTLf/LDLf Non-Markovian Rewards

    In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., it depends only on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work, both of which use LTL variants to specify the reward function and then compile the new model back into a Markovian one. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
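    The abstract does not spell out the construction; the following is a minimal sketch of the underlying idea only, assuming the LDLf reward formula has already been translated into a deterministic automaton: threading the automaton state alongside the MDP makes the reward Markovian over the extended state. The "coffee only following a request" automaton and all function names below are illustrative, not taken from the paper.

```python
# Illustrative product construction: a hand-written DFA (standing in for the
# automaton of an LDLf formula) is tracked along the trace, so the reward
# becomes Markovian over the extended (MDP state, DFA state) space.

def dfa_step(q, action):
    """DFA for 'a request is pending': 0 = no pending request, 1 = pending."""
    if action == "request":
        return 1
    if action == "deliver":
        return 0
    return q

def product_reward(q, action):
    """Markovian reward on the extended state: delivering coffee is rewarded
    only while a request is pending."""
    return 1.0 if (q == 1 and action == "deliver") else 0.0

def total_reward(trace):
    """Accumulate the (originally non-Markovian) reward along a trace by
    carrying the DFA state as extra, Markovian state."""
    q, total = 0, 0.0
    for action in trace:
        total += product_reward(q, action)
        q = dfa_step(q, action)
    return total

if __name__ == "__main__":
    # Only the second delivery follows a request, so the total is 1.0.
    print(total_reward(["deliver", "request", "wait", "deliver"]))
```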

    Connecting standard AI planning constructs with a classical cooperative model of transferable-utility coalition games, we introduce the notion of transferable-utility (TU) planning games. The key representational property of these games is that coalitions are valued implicitly, based on their ability to carry out efficient joint plans. On the expressiveness side, we show that existing succinct representations of monotonic TU games can be efficiently compiled into TU planning games. On the computational side, TU planning games allow us to provide some of the strongest tractability results to date for core-existence and core-membership queries in succinct TU coalition games.
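    As a companion to the core queries mentioned above, here is a small brute-force sketch of core membership in a TU game. The hard-coded coalition values stand in for "value of the best joint plan a coalition can execute", and the exponential enumeration is for illustration only; it does not reflect the paper's tractability results. All names are hypothetical.

```python
# Brute-force core-membership check for a TU game given by explicit
# coalition values: a payoff vector is in the core if it distributes the
# grand-coalition value exactly and no coalition can do better on its own.

from itertools import combinations

def core_membership(players, v, payoff, tol=1e-9):
    """Return True iff `payoff` (dict player -> share) lies in the core of
    the TU game with value function `v` over frozensets of players."""
    grand = frozenset(players)
    if abs(sum(payoff[p] for p in players) - v(grand)) > tol:
        return False  # not efficient: does not distribute v(N) exactly
    for r in range(1, len(players)):
        for S in combinations(players, r):
            if sum(payoff[p] for p in S) < v(frozenset(S)) - tol:
                return False  # coalition S would deviate
    return True

if __name__ == "__main__":
    players = ["a", "b", "c"]
    # Toy coalition values standing in for best-joint-plan values.
    values = {
        frozenset(): 0.0,
        frozenset("a"): 0.0, frozenset("b"): 0.0, frozenset("c"): 0.0,
        frozenset("ab"): 2.0, frozenset("ac"): 2.0, frozenset("bc"): 2.0,
        frozenset("abc"): 3.0,
    }
    v = values.__getitem__
    print(core_membership(players, v, {"a": 1.0, "b": 1.0, "c": 1.0}))  # True
```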