Online Tool Selection with Learned Grasp Prediction Models
Deep learning-based grasp prediction models have become an industry standard
for robotic bin-picking systems. To maximize pick success, production
environments are often equipped with several end-effector tools that can be
swapped on-the-fly, based on the target object. Tool-change, however, takes
time. Choosing the order of grasps to perform, and corresponding tool-change
actions, can improve system throughput; this is the topic of our work. The main
challenge in planning tool change is uncertainty: we typically cannot see
objects in the bin that are currently occluded. Inspired by queuing and
admission control problems, we model the problem as a Markov Decision Process
(MDP), where the goal is to maximize expected throughput, and we pursue an
approximate solution based on model predictive control, where at each time step
we plan based only on the currently visible objects. Special to our method is
the idea of void zones, which are geometrical boundaries in which an unknown
object will be present, and therefore cannot be accounted for during planning.
Our planning problem can be solved using integer linear programming (ILP).
However, we find that an approximate solution based on sparse tree search
yields near optimal performance at a fraction of the time. Another question
that we explore is how to measure the performance of tool-change planning: we
find that throughput alone can fail to capture delicate and smooth behavior,
and propose a principled alternative. Finally, we demonstrate our algorithms on
both synthetic and real-world bin-picking tasks.
Comment: 14 pages (including the cover page), 5 Figures, Technical Report,
OSARO In
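The trade-off described in the abstract above (a better-suited tool raises pick success, but swapping tools costs time) can be illustrated with a minimal sketch. This is not the authors' ILP or sparse tree search: it is a brute-force enumeration over the currently visible objects only, and it ignores void zones and newly revealed objects. The function names, the `success` table layout, and the timing constants are all assumptions made for illustration.

```python
from itertools import permutations, product

PICK_TIME = 1.0  # assumed seconds per grasp attempt (hypothetical value)


def expected_throughput(plan, success, start_tool, change_time):
    """Expected successful picks per second for a fixed plan.

    plan: sequence of (object_id, tool) pairs, executed in order.
    success: success[object_id][tool] -> predicted grasp success probability.
    """
    elapsed, tool, expected_picks = 0.0, start_tool, 0.0
    for obj, chosen_tool in plan:
        if chosen_tool != tool:       # pay the swap cost only when the tool changes
            elapsed += change_time
            tool = chosen_tool
        elapsed += PICK_TIME
        expected_picks += success[obj][chosen_tool]
    return expected_picks / elapsed if elapsed > 0 else 0.0


def best_plan(success, start_tool, change_time=3.0):
    """Enumerate every grasp order and tool assignment; keep the best score."""
    best, best_score = None, -1.0
    for order in permutations(success):
        for tools in product(*(success[obj].keys() for obj in order)):
            plan = tuple(zip(order, tools))
            score = expected_throughput(plan, success, start_tool, change_time)
            if score > best_score:
                best, best_score = plan, score
    return best, best_score


# Hypothetical grasp-success predictions for three visible objects.
example = {
    "a": {"suction": 0.9, "gripper": 0.5},
    "b": {"suction": 0.2, "gripper": 0.95},
    "c": {"suction": 0.85, "gripper": 0.4},
}
slow_swap_plan, _ = best_plan(example, "suction", change_time=3.0)
fast_swap_plan, _ = best_plan(example, "suction", change_time=0.1)
```

With these made-up numbers, a slow tool change makes it better to keep the suction tool even for object `b`, while a fast tool change makes it worthwhile to pick `b` last with the gripper. Exhaustive enumeration grows factorially with the number of objects, which is one reason an approximate search is attractive in practice.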
PALMER: Perception-Action Loop with Memory for Long-Horizon Planning
To achieve autonomy in a priori unknown real-world scenarios, agents should
be able to: i) act from high-dimensional sensory observations (e.g., images),
ii) learn from past experience to adapt and improve, and iii) be capable of
long-horizon planning. Classical planning algorithms (e.g., PRM, RRT) are
proficient at handling long-horizon planning. Deep learning-based methods in
turn can provide the necessary representations to address the others, by
modeling statistical contingencies between observations. In this direction, we
introduce a general-purpose planning algorithm called PALMER that combines
classical sampling-based planning algorithms with learning-based perceptual
representations. For training these perceptual representations, we combine
Q-learning with contrastive representation learning to create a latent space
where the distance between the embeddings of two states captures how easily an
optimal policy can traverse between them. For planning with these perceptual
representations, we re-purpose classical sampling-based planning algorithms to
retrieve previously observed trajectory segments from a replay buffer and
restitch them into approximately optimal paths that connect any given pair of
start and goal states. This creates a tight feedback loop between
representation learning, memory, reinforcement learning, and sampling-based
planning. The end result is an experiential framework for long-horizon planning
that is significantly more robust and sample efficient compared to existing
methods.
Comment: Website: https://palmer.epfl.c
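The idea in the abstract above of connecting start and goal states through previously observed states, using latent distance as a traversal cost, can be sketched as a shortest-path search over a replay buffer. This is a simplified, PRM-style illustration and not the PALMER algorithm itself: the embeddings are assumed to be given as plain coordinate tuples, Euclidean distance stands in for the learned latent distance, and `stitch_path` and `radius` are hypothetical names.

```python
import heapq
import math


def stitch_path(embeddings, start, goal, radius):
    """Dijkstra over buffer states.

    embeddings: dict mapping a state id to its latent vector (tuple of floats).
    Two states are connected when their latent distance is at most `radius`.
    Returns (list of state ids, total latent cost), or (None, inf) if unreachable.
    """
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            # Walk parent pointers back to the start to recover the path.
            path = [u]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1], d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, z_v in embeddings.items():
            if v == u:
                continue
            w = math.dist(embeddings[u], z_v)
            if w <= radius and d + w < dist.get(v, math.inf):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return None, math.inf


# Hypothetical 2-D latent embeddings of states stored in a replay buffer.
buffer_embeddings = {
    "s": (0.0, 0.0),
    "m1": (1.0, 0.0),
    "m2": (2.0, 0.0),
    "g": (3.0, 0.0),
    "far": (0.0, 5.0),
}
path, cost = stitch_path(buffer_embeddings, "s", "g", radius=1.5)
```

Here the connection radius plays the role of a threshold on how far apart two states may be while still counting as easily traversable; a radius that is too small leaves the goal unreachable.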
Approximate Predictive Control Barrier Functions using Neural Networks: A Computationally Cheap and Permissive Safety Filter
A predictive control barrier function (PCBF) based safety filter allows for
verifying arbitrary control inputs with respect to future constraint
satisfaction. The approach relies on the solution of two optimization problems:
first computing the minimal constraint relaxations given the current state, and then
computing the minimal deviation from a proposed input such that the relaxed
constraints are satisfied. This paper presents an approximation procedure that
uses a neural network to approximate the optimal value function of the first
optimization problem from samples, such that the computation becomes
independent of the prediction horizon. It is shown that this approximation
guarantees that states converge to a neighborhood of the implicitly defined
safe set of the original problem, where system constraints can be satisfied for
all times forward. The convergence result relies on a novel class $\mathcal{K}$
lower bound on the PCBF decrease and depends on the approximation error of the
neural network. Lastly, we demonstrate our approach in simulation for an
autonomous driving example and show that the proposed approximation leads to a
significant decrease in computation time compared to the original approach.
Comment: Submitted to ECC2
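The two-step structure described in the abstract above can be written out schematically. The notation here is assumed for illustration and is not taken from the paper: dynamics $f$, a constraint function $c$ (with $c(x,u) \le 0$ meaning the constraints hold), horizon $N$, slack variables $\xi_i$, proposed input $u_p$, and a network $\hat h_\theta$. Terminal ingredients and any constraint tightening used in the actual PCBF formulation are omitted.

```latex
% Step 1: minimal constraint relaxation from the current state x
h(x) = \min_{u_0,\dots,u_{N-1},\ \xi_0,\dots,\xi_{N-1}} \ \sum_{i=0}^{N-1} \|\xi_i\|
\quad \text{s.t.} \quad x_0 = x,\ \ x_{i+1} = f(x_i, u_i),\ \ c(x_i, u_i) \le \xi_i,\ \ \xi_i \ge 0 .

% Step 2: minimal deviation from the proposed input u_p under the relaxed constraints,
% with \xi_i^{*} the optimal slacks from Step 1
\min_{u_0,\dots,u_{N-1}} \ \|u_0 - u_p\|
\quad \text{s.t.} \quad x_0 = x,\ \ x_{i+1} = f(x_i, u_i),\ \ c(x_i, u_i) \le \xi_i^{*} .

% Approximation: a neural network fitted to sampled pairs (x, h(x))
\hat h_\theta(x) \approx h(x)
```

Evaluating $\hat h_\theta$ is a single forward pass, whereas Step 1 requires optimizing over all $N$ predicted steps; this is the sense in which the approximation removes the dependence on the prediction horizon.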