159 research outputs found

    An innovative wheel–rail contact model for railway vehicles under degraded adhesion conditions

    Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach

    Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They allow state uncertainty to be modeled as a belief probability distribution. Approximate solvers based on Monte Carlo sampling have shown great success in relaxing the computational demand and performing online planning. However, scaling to complex, realistic domains with many actions and long planning horizons is still a major challenge, and a key point for achieving good performance is guiding the action-selection process with domain-dependent policy heuristics tailored to the specific application domain. We propose to learn high-quality heuristics from POMDP execution traces generated by any solver. We convert the belief-action pairs to a logical semantics and exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications, which are then used as online heuristics. We thoroughly evaluate our methodology on two notoriously challenging POMDP problems involving large action spaces and long planning horizons, namely rocksample and pocman. Considering different state-of-the-art online POMDP solvers, including POMCP, DESPOT and AdaOPS, we show that learned heuristics expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, with lower computational time. Moreover, they generalize well to more challenging scenarios not experienced in the training phase (e.g., increasing the number of rocks and the grid size in rocksample, and increasing the map size and the aggressiveness of the ghosts in pocman).
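
    A minimal sketch of the first step described above: converting a sampled belief into qualitative logical facts and applying a hypothetical learned rule to pick promising actions in a rocksample-like domain. The predicate names, the probability threshold and the rule itself are illustrative assumptions, not the specifications actually learned by the ILP system.

```python
# Sketch: belief -> ASP-style facts -> rule-based action preference (rocksample-like).
# All predicate names, thresholds and the rule are illustrative assumptions.

def belief_to_facts(particles, n_rocks, threshold=0.9):
    """particles: list of tuples of rock values (1 = valuable), one per belief sample."""
    facts = []
    n = len(particles)
    for r in range(n_rocks):
        p_valuable = sum(state[r] for state in particles) / n
        if p_valuable >= threshold:
            facts.append(f"almost_surely_valuable({r})")
        elif p_valuable <= 1.0 - threshold:
            facts.append(f"almost_surely_worthless({r})")
    return facts

def preferred_actions(facts, n_rocks):
    """Hypothetical learned rule: sample a rock only when it is almost surely valuable."""
    preferred = [f"sample({r})" for r in range(n_rocks)
                 if f"almost_surely_valuable({r})" in facts]
    return preferred or ["move"]  # otherwise fall back to navigation actions

particles = [(1, 0, 1), (1, 0, 0), (1, 1, 1), (1, 0, 1)]
facts = belief_to_facts(particles, n_rocks=3, threshold=0.75)
print(facts)                        # ['almost_surely_valuable(0)', 'almost_surely_worthless(1)', ...]
print(preferred_actions(facts, n_rocks=3))
```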

    From POMDP executions to policy specifications

    Partially Observable Markov Decision Processes (POMDPs) allow modeling systems with uncertain state using probability distributions over states (called beliefs). However, in complex domains, POMDP solvers must explore large belief spaces, which is computationally intractable. One solution is to introduce domain knowledge that drives exploration, in the form of logic specifications. However, defining effective specifications may be challenging even for domain experts. We propose an approach based on inductive logic programming to learn specifications, together with a confidence level, from observed POMDP executions. We show that the learning approach converges to robust specifications as the number of examples increases.
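
    A minimal sketch of how a confidence level for a candidate specification could be estimated from execution traces: the confidence is the fraction of observed belief-action examples the rule explains, and it stabilizes as the number of examples grows. The toy rule, the noise level and all names are assumptions for illustration, not the authors' learning procedure.

```python
# Sketch: confidence of a candidate rule = fraction of belief-action examples it covers.
# The rule, the simulated policy and the 5% noise level are illustrative assumptions.
import random

def candidate_rule(belief):
    """Hypothetical rule: recommend 'sample' when P(valuable) > 0.9, else 'move'."""
    return "sample" if belief > 0.9 else "move"

def confidence(rule, examples):
    """examples: list of (belief, action) pairs observed in POMDP executions."""
    covered = sum(1 for belief, action in examples if rule(belief) == action)
    return covered / len(examples)

random.seed(0)

def sample_example():
    # Simulated execution: the underlying policy follows the rule 95% of the time.
    belief = random.random()
    if random.random() < 0.95:
        action = candidate_rule(belief)
    else:
        action = "move" if candidate_rule(belief) == "sample" else "sample"
    return belief, action

for n in (10, 100, 1000, 10000):
    examples = [sample_example() for _ in range(n)]
    print(n, round(confidence(candidate_rule, examples), 3))  # converges toward ~0.95
```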

    Optimization of Potential Field Method Parameters through networks for Swarm Cooperative Manipulation Tasks

    An interesting current research field related to autonomous robots is mobile manipulation performed by cooperating robots (in terrestrial, aerial and underwater environments). Focusing on the underwater scenario, cooperative manipulation by Intervention Autonomous Underwater Vehicles (I-AUVs) is a complex and difficult application compared with its terrestrial or aerial counterparts because of many technical issues, such as underwater localization and limited communication. A decentralized approach for cooperative mobile manipulation of I-AUVs based on Artificial Neural Networks (ANNs) is proposed in this article. The strategy exploits the potential field method; a multi-layer control structure is developed to manage the coordination of the swarm, the guidance and navigation of the I-AUVs, and the manipulation task. The new strategy has been implemented in a simulation environment, simulating the transportation of an object. The object is moved along a desired trajectory in an unknown environment by four underwater mobile robots, each equipped with a seven-degree-of-freedom robotic arm. The simulation results are optimized thanks to the ANNs, which are used to tune the potentials.
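
    A minimal sketch of the classical potential field update that the multi-layer control structure builds on: each vehicle is driven by an attractive term toward its goal and repulsive terms away from obstacles, with the gains k_att and k_rep standing in for the quantities the ANNs are used to tune. All names and values are illustrative, not the authors' implementation.

```python
# Sketch of a potential-field velocity command for one vehicle.
# k_att and k_rep are the gains an ANN would tune; fixed values are used here.
import numpy as np

def potential_field_velocity(pos, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=2.0):
    """Return a commanded velocity from attractive and repulsive potentials."""
    force = k_att * (goal - pos)                       # attractive component toward the goal
    for obs in obstacles:
        diff = pos - obs
        rho = np.linalg.norm(diff)
        if 1e-6 < rho < rho0:                          # repulsion only within influence radius rho0
            force += k_rep * (1.0 / rho - 1.0 / rho0) / rho**2 * (diff / rho)
    return force

pos = np.array([0.0, 0.0, -5.0])
goal = np.array([10.0, 0.0, -5.0])
obstacles = [np.array([4.0, 0.5, -5.0])]
print(potential_field_velocity(pos, goal, obstacles))
```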

    Learning logic specifications for soft policy guidance in POMCP

    Partially Observable Monte Carlo Planning (POMCP) is an efficient solver for Partially Observable Markov Decision Processes (POMDPs). It scales to large state spaces by computing an approximation of the optimal policy locally and online, using a strategy based on Monte Carlo Tree Search. However, POMCP suffers from sparse reward functions, namely rewards achieved only when the final goal is reached, particularly in environments with large state spaces and long horizons. Recently, logic specifications have been integrated into POMCP to guide exploration and to satisfy safety requirements. However, such policy-related rules require manual definition by domain experts, especially in real-world scenarios. In this paper, we use inductive logic programming to learn logic specifications from traces of POMCP executions, i.e., sets of belief-action pairs generated by the planner. Specifically, we learn rules expressed in the paradigm of answer set programming. We then integrate them into POMCP to provide a soft policy bias toward promising actions. In two benchmark scenarios, rocksample and battery, we show that integrating rules learned from small task instances can improve performance with fewer Monte Carlo simulations and in larger task instances. We make our modified version of POMCP publicly available at https://github.com/GiuMaz/pomcp_clingo.git
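
    A minimal sketch of what a soft policy bias inside Monte Carlo Tree Search action selection could look like: a bonus, decaying with the visit count, is added to the UCB1 score of actions preferred by the learned rules. The bias weight, the node fields and the rule interface are assumptions for illustration, not the code released in the repository above.

```python
# Sketch: UCB1 action selection with a soft, vanishing bias toward rule-preferred actions.
# lam (bias weight), node layout and rule_preferred are illustrative assumptions.
import math
from types import SimpleNamespace

def select_action(node, rule_preferred, c=1.0, lam=2.0):
    """node.children: dict action -> (visits, total_value); node.visits: total node visits."""
    best, best_score = None, float("-inf")
    for action, (n_a, q_total) in node.children.items():
        if n_a == 0:
            return action                               # expand unvisited actions first
        q = q_total / n_a
        explore = c * math.sqrt(math.log(node.visits) / n_a)
        bias = lam / n_a if action in rule_preferred else 0.0   # soft bias, vanishes with visits
        score = q + explore + bias
        if score > best_score:
            best, best_score = action, score
    return best

node = SimpleNamespace(visits=20, children={"sample(0)": (5, 40.0), "move": (15, 90.0)})
print(select_action(node, rule_preferred={"sample(0)"}))
```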

    Monte Carlo planning for mobile robots in large action spaces with velocity obstacles

    Motion planning in dynamic environments is a challenging robotic task, requiring collision avoidance and real-time computation. State-of-the-art online methods such as Velocity Obstacles (VO) guarantee safe local planning, while global planning methods based on reinforcement learning or graph discretization are either computationally inefficient or not provably collision-safe. In this paper, we combine Monte Carlo Tree Search (MCTS) with VO to prune unsafe actions (i.e., colliding velocities). In this way, we can plan with very few MCTS simulations even in very large action spaces (60 actions), achieving higher cumulative reward and lower computational time per step than pure MCTS with many simulations. Moreover, our methodology guarantees collision avoidance thanks to action pruning with VO, while pure MCTS does not. The results in this paper pave the way towards deployment of MCTS planning on real robots and multi-agent decentralized motion planning.
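
    A minimal sketch of the VO-based pruning step: each candidate velocity in a discretized action set is kept only if it does not drive the robot into any obstacle within a finite time horizon, and only the surviving velocities are handed to MCTS. The geometry below is the standard VO ray-versus-disc test; radii, horizon and the 60-action discretization are illustrative assumptions.

```python
# Sketch: prune colliding velocities with a Velocity Obstacle (VO) test before MCTS.
# Radii, the time horizon tau and the discretized action set are illustrative assumptions.
import numpy as np

def collides(p_rob, v_cand, p_obs, v_obs, radius_sum, tau=3.0, steps=30):
    """True if the candidate velocity leads into the obstacle within tau seconds."""
    v_rel = v_cand - v_obs
    for t in np.linspace(tau / steps, tau, steps):
        if np.linalg.norm(p_rob + v_rel * t - p_obs) < radius_sum:
            return True
    return False

def prune_actions(p_rob, actions, obstacles, radius_sum=0.6):
    """Keep only velocities that are VO-safe with respect to every obstacle."""
    return [v for v in actions
            if not any(collides(p_rob, v, p_o, v_o, radius_sum) for p_o, v_o in obstacles)]

# 60-action set: 12 headings x 5 speeds, as in a discretized velocity space.
angles = np.linspace(0, 2 * np.pi, 12, endpoint=False)
speeds = np.linspace(0.2, 1.0, 5)
actions = [s * np.array([np.cos(a), np.sin(a)]) for a in angles for s in speeds]

obstacles = [(np.array([2.0, 0.0]), np.array([-0.5, 0.0]))]   # (position, velocity) pairs
safe = prune_actions(np.zeros(2), actions, obstacles)
print(f"{len(safe)} of {len(actions)} velocities are VO-safe")
```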