
    Online Tool Selection with Learned Grasp Prediction Models

    Deep learning-based grasp prediction models have become an industry standard for robotic bin-picking systems. To maximize pick success, production environments are often equipped with several end-effector tools that can be swapped on the fly, based on the target object. Tool changes, however, take time. Choosing the order of grasps to perform, and the corresponding tool-change actions, can improve system throughput; this is the topic of our work. The main challenge in planning tool changes is uncertainty: we typically cannot see objects in the bin that are currently occluded. Inspired by queuing and admission control problems, we model the problem as a Markov Decision Process (MDP), where the goal is to maximize expected throughput, and we pursue an approximate solution based on model predictive control, where at each time step we plan based only on the currently visible objects. Special to our method is the idea of void zones, which are geometrical boundaries in which an unknown object will be present, and therefore cannot be accounted for during planning. Our planning problem can be solved using integer linear programming (ILP). However, we find that an approximate solution based on sparse tree search yields near-optimal performance in a fraction of the time. Another question that we explore is how to measure the performance of tool-change planning: we find that throughput alone can fail to capture delicate and smooth behavior, and propose a principled alternative. Finally, we demonstrate our algorithms on both synthetic and real-world bin-picking tasks. Comment: 14 pages (including the cover page), 5 Figures, Technical Report, OSARO In
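
The receding-horizon idea in this abstract (plan over the currently visible objects only, then re-plan) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's method: the timing constants, the `(tool, success_probability)` grasp representation, and the brute-force enumeration, which stands in for the ILP and sparse tree search mentioned above. Void zones are not modeled.

```python
from itertools import permutations

PICK_TIME = 1.0     # assumed seconds per grasp attempt (hypothetical value)
CHANGE_TIME = 3.0   # assumed seconds per tool change (hypothetical value)


def plan_throughput(order, current_tool):
    """Expected successful picks per second for one ordering of grasps."""
    elapsed, expected_picks, tool = 0.0, 0.0, current_tool
    for grasp_tool, p_success in order:
        if grasp_tool != tool:
            # Switching tools costs time before the grasp can be attempted.
            elapsed += CHANGE_TIME
            tool = grasp_tool
        elapsed += PICK_TIME
        expected_picks += p_success
    return expected_picks / elapsed if elapsed > 0 else 0.0


def best_plan(visible_grasps, current_tool, horizon=3):
    """Enumerate every ordering of up to `horizon` visible grasps."""
    best, best_value = (), 0.0
    for k in range(1, min(horizon, len(visible_grasps)) + 1):
        for order in permutations(visible_grasps, k):
            value = plan_throughput(order, current_tool)
            if value > best_value:
                best, best_value = order, value
    return best, best_value


# Each visible grasp is (required tool, predicted success probability).
visible = [("suction", 0.9), ("gripper", 0.95), ("suction", 0.8)]
plan, value = best_plan(visible, current_tool="suction")
```

In a model-predictive loop, only the first grasp of `plan` would be executed; the bin is then re-imaged and the search repeated with whatever objects have become visible.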

    PALMER: Perception-Action Loop with Memory for Long-Horizon Planning

    To achieve autonomy in a priori unknown real-world scenarios, agents should be able to: i) act from high-dimensional sensory observations (e.g., images), ii) learn from past experience to adapt and improve, and iii) be capable of long-horizon planning. Classical planning algorithms (e.g., PRM, RRT) are proficient at handling long-horizon planning. Deep learning-based methods in turn can provide the necessary representations to address the others, by modeling statistical contingencies between observations. In this direction, we introduce a general-purpose planning algorithm called PALMER that combines classical sampling-based planning algorithms with learning-based perceptual representations. For training these perceptual representations, we combine Q-learning with contrastive representation learning to create a latent space where the distance between the embeddings of two states captures how easily an optimal policy can traverse between them. For planning with these perceptual representations, we re-purpose classical sampling-based planning algorithms to retrieve previously observed trajectory segments from a replay buffer and restitch them into approximately optimal paths that connect any given pair of start and goal states. This creates a tight feedback loop between representation learning, memory, reinforcement learning, and sampling-based planning. The end result is an experiential framework for long-horizon planning that is significantly more robust and sample efficient compared to existing methods. Comment: Website: https://palmer.epfl.c

    Approximate Predictive Control Barrier Functions using Neural Networks: A Computationally Cheap and Permissive Safety Filter

    A predictive control barrier function (PCBF) based safety filter allows for verifying arbitrary control inputs with respect to future constraint satisfaction. The approach relies on the solution of two optimization problems computing the minimal constraint relaxations given the current state, and then computing the minimal deviation from a proposed input such that the relaxed constraints are satisfied. This paper presents an approximation procedure that uses a neural network to approximate the optimal value function of the first optimization problem from samples, such that the computation becomes independent of the prediction horizon. It is shown that this approximation guarantees that states converge to a neighborhood of the implicitly defined safe set of the original problem, where system constraints can be satisfied for all times forward. The convergence result relies on a novel class $\mathcal{K}$ lower bound on the PCBF decrease and depends on the approximation error of the neural network. Lastly, we demonstrate our approach in simulation for an autonomous driving example and show that the proposed approximation leads to a significant decrease in computation time compared to the original approach. Comment: Submitted to ECC2
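
The two-stage structure described in this abstract can be written schematically as follows. The notation is assumed for illustration and is not taken from the paper: $f$ is the system dynamics, $c$ the constraint function, $N$ the prediction horizon, $\xi_k$ the constraint relaxations, and $u_p$ the proposed input. Terminal ingredients are omitted, so this is a sketch of the general shape rather than the exact formulation.

```latex
% Step 1 (assumed notation): smallest constraint relaxation from the current state x
h(x) = \min_{u_{0:N-1},\; \xi_{0:N-1} \ge 0} \; \sum_{k=0}^{N-1} \|\xi_k\|
\quad \text{s.t.} \quad x_0 = x, \quad x_{k+1} = f(x_k, u_k), \quad c(x_k, u_k) \le \xi_k

% Step 2 (assumed notation): smallest deviation from the proposed input u_p,
% subject to the constraints relaxed by the optimal slacks \xi_k^\star of Step 1
\min_{u_{0:N-1}} \; \|u_0 - u_p\|^2
\quad \text{s.t.} \quad x_0 = x, \quad x_{k+1} = f(x_k, u_k), \quad c(x_k, u_k) \le \xi_k^\star

% A class-\mathcal{K} function (standard definition)
\alpha : [0, a) \to [0, \infty) \ \text{continuous and strictly increasing, with } \alpha(0) = 0
```

Under this reading, the neural network replaces the evaluation of $h(x)$ in Step 1, which is why the online cost no longer grows with $N$.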