
    Visual servoing of an autonomous helicopter in urban areas using feature tracking

    We present the design and implementation of a vision-based feature tracking system for an autonomous helicopter. Visual sensing is used to estimate the position and velocity of features in the image plane (urban features such as windows) in order to generate velocity references for the flight control. These vision-based references are then combined with GPS positioning references to navigate towards the features and track them. We present results from experimental flight trials, performed on two UAV systems under different conditions, that show the feasibility and robustness of our approach.
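    The control scheme described above is in the spirit of classical image-based visual servoing, where image-plane feature errors are mapped to camera velocity commands. Below is a minimal sketch of such a velocity law using the standard point-feature interaction matrix; the feature coordinates, depths, and gain are illustrative assumptions, not values from the paper.

    import numpy as np

    def interaction_matrix(x, y, Z):
        """Interaction (image Jacobian) matrix for one normalized image point at depth Z."""
        return np.array([
            [-1 / Z, 0, x / Z, x * y, -(1 + x**2), y],
            [0, -1 / Z, y / Z, 1 + y**2, -x * y, -x],
        ])

    def ibvs_velocity(features, goals, depths, gain=0.5):
        """Camera velocity (vx, vy, vz, wx, wy, wz) driving features toward goals."""
        error = (features - goals).reshape(-1)              # stacked image-plane error
        L = np.vstack([interaction_matrix(x, y, Z)
                       for (x, y), Z in zip(features, depths)])
        return -gain * np.linalg.pinv(L) @ error            # v = -lambda * L^+ * e

    # Example: two tracked window corners, assumed 10 m away (hypothetical values).
    feats = np.array([[0.10, 0.05], [0.20, 0.05]])
    goals = np.array([[0.00, 0.00], [0.10, 0.00]])
    print(ibvs_velocity(feats, goals, depths=[10.0, 10.0]))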

    Compositional Servoing by Recombining Demonstrations

    Learning-based manipulation policies trained on image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. To traverse the demonstration graph at inference time, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher success on the respective tasks. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach. Comment: http://compservo.cs.uni-freiburg.d
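    A minimal sketch of the graph-traversal idea described above: split demonstrations into per-step states, link consecutive states within a demonstration, add cross-edges between sufficiently similar states of different demonstrations, and run a shortest-path search. The similarity function, threshold, and edge weights here are illustrative assumptions (the paper selects demonstrations with a learned, image-based similarity).

    import heapq
    import numpy as np

    def similarity(a, b):
        """Toy state similarity in (0, 1]; a stand-in for a learned image similarity."""
        return float(np.exp(-np.linalg.norm(a - b)))

    def build_graph(demos, threshold=0.6):
        """Nodes are (demo, step) states; edges follow each demo forward or
        recombine demos by jumping between similar states."""
        nodes = [(d, t) for d, demo in enumerate(demos) for t in range(len(demo))]
        edges = {n: [] for n in nodes}
        for d, demo in enumerate(demos):                    # follow each demo forward
            for t in range(len(demo) - 1):
                edges[(d, t)].append(((d, t + 1), 1.0))
        for i, (d1, t1) in enumerate(nodes):                # recombination cross-edges
            for d2, t2 in nodes[i + 1:]:
                if d1 != d2:
                    s = similarity(demos[d1][t1], demos[d2][t2])
                    if s > threshold:
                        edges[(d1, t1)].append(((d2, t2), 1.0 / s))
                        edges[(d2, t2)].append(((d1, t1), 1.0 / s))
        return edges

    def shortest_path(edges, start, goal):
        """Plain Dijkstra over the demonstration graph."""
        dist, prev, pq = {start: 0.0}, {}, [(0.0, start)]
        while pq:
            d, u = heapq.heappop(pq)
            if u == goal:
                break
            if d > dist.get(u, np.inf):
                continue
            for v, w in edges[u]:
                if d + w < dist.get(v, np.inf):
                    dist[v], prev[v] = d + w, u
                    heapq.heappush(pq, (d + w, v))
        path, u = [goal], goal
        while u != start:
            u = prev[u]
            path.append(u)
        return path[::-1]

    # Two toy demonstrations that overlap near the state [1, 0].
    demos = [[np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([2.0, 0.0])],
             [np.array([1.0, 0.1]), np.array([1.0, 1.0]), np.array([1.0, 2.0])]]
    g = build_graph(demos)
    print(shortest_path(g, (0, 0), (1, 2)))   # crosses from demo 0 into demo 1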

    CNS: Correspondence Encoded Neural Image Servo Policy

    Image servoing is an indispensable technique in robotic applications that helps achieve high-precision positioning. The intermediate representation of an image servo policy is important for sensor input abstraction and policy output guidance. Classical approaches achieve high precision but require clean keypoint correspondence, and suffer from a limited convergence basin or weak robustness to feature errors. Recent learning-based methods achieve moderate precision and a large convergence basin on specific scenes but face issues when generalizing to novel environments. In this paper, we encode keypoints and correspondence into a graph and use a graph neural network as the controller architecture. This design combines both advantages: a generalizable intermediate representation from keypoint correspondence and the strong modeling ability of neural networks. Other techniques, including realistic data generation, feature clustering, and distance decoupling, are proposed to further improve efficiency, precision, and generalization. Experiments in simulation and the real world verify the effectiveness of our method in speed (maximum 40 fps along with the observer), precision (<0.3° and sub-millimeter accuracy), and generalization (sim-to-real without fine-tuning). Project homepage (full paper with supplementary text, video and code): https://hhcaz.github.io/CNS-hom
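    One plausible way to encode keypoints and correspondence into a graph, as described above, is sketched below: each matched keypoint pair becomes a node whose features include the correspondence error, and edges connect spatial neighbors so a graph neural network can aggregate local context. The node features and the k-nearest-neighbor edge rule are assumptions for illustration, not the paper's exact encoding, and the actual controller is a trained GNN.

    import numpy as np

    def correspondence_graph(current_kp, desired_kp, k=4):
        """Node features = (current point, desired point, error); edges = k-NN
        in the current image so the controller can reason over local structure."""
        err = desired_kp - current_kp
        nodes = np.hstack([current_kp, desired_kp, err])    # (N, 6) node features
        d = np.linalg.norm(current_kp[:, None] - current_kp[None, :], axis=-1)
        nn = np.argsort(d, axis=1)[:, 1:k + 1]              # skip self at index 0
        edges = np.array([(i, j) for i in range(len(current_kp)) for j in nn[i]])
        return nodes, edges.T                               # edge_index, shape (2, E)

    # Hypothetical keypoints: desired positions perturbed from the current ones.
    kp_cur = np.random.rand(8, 2)
    kp_des = kp_cur + 0.05 * np.random.randn(8, 2)
    nodes, edge_index = correspondence_graph(kp_cur, kp_des)
    print(nodes.shape, edge_index.shape)                    # (8, 6) (2, 32)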

    Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models

    In this paper, we propose a framework for building knowledgeable robot control in the scope of smart human-robot interaction, by empowering a basic uncalibrated visual servoing controller with contextual knowledge through the joint use of event knowledge graphs (EKGs) and large-scale pretrained vision-language models (VLMs). The framework is twofold: first, we interpret low-level image geometry as high-level concepts, allowing us to prompt VLMs and to select geometric features of points and lines for motor control skills; second, we create an event knowledge graph (EKG) to conceptualize a robot manipulation task of interest, where the main body of the EKG is characterized by an executable behavior tree and the leaves by semantic concepts relevant to the manipulation context. We demonstrate, in an uncalibrated environment with real robot trials, that our method lowers the reliance on human annotation during task interfacing, allows the robot to perform activities of daily living more easily by treating low-level geometry-based motor control skills as high-level concepts, and is beneficial for building cognitive reasoning into smart robot applications.
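    A minimal sketch of an executable behavior tree whose leaves bind semantic concepts to low-level geometric motor skills, loosely following the EKG structure described above; the node types, the task, and the skill functions are hypothetical, not the paper's implementation.

    class Node:
        def tick(self):
            raise NotImplementedError

    class Sequence(Node):
        """Succeeds only if all children succeed, in order."""
        def __init__(self, *children):
            self.children = children
        def tick(self):
            return all(child.tick() for child in self.children)

    class SemanticLeaf(Node):
        """Leaf binding a semantic concept to a low-level motor control skill."""
        def __init__(self, concept, skill):
            self.concept, self.skill = concept, skill
        def tick(self):
            print(f"[{self.concept}] -> {self.skill.__name__}")
            return self.skill()

    # Hypothetical point/line-feature skills a VLM might select for each concept.
    def align_point_features():
        return True

    def align_line_features():
        return True

    task = Sequence(
        SemanticLeaf("grasp the cup", align_point_features),
        SemanticLeaf("tilt over the bowl", align_line_features),
    )
    task.tick()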

    Analysis and Observations from the First Amazon Picking Challenge

    This paper presents an overview of the inaugural Amazon Picking Challenge, along with a summary of a survey conducted among the 26 participating teams. The challenge goal was to design an autonomous robot to pick items from a warehouse shelf. This task is currently performed by human workers, and there is hope that robots can someday help increase efficiency and throughput while lowering cost. We report on a 28-question survey posed to the teams to learn about each team's background, mechanism design, perception apparatus, and planning and control approach. We identify trends in this data, correlate them with each team's success in the competition, and discuss observations and lessons learned based on the survey results and the authors' personal experiences during the challenge.

    DeformerNet: Learning Bimanual Manipulation of 3D Deformable Objects

    Applications in fields ranging from home care to warehouse fulfillment to surgical assistance require robots to reliably manipulate the shape of 3D deformable objects. Analytic models of elastic, 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom present in determining the object's shape. Previous attempts at performing 3D shape control rely on hand-crafted features to represent the object shape and require training of object-specific control models. We overcome these issues through the use of our novel DeformerNet neural network architecture, which operates on a partial-view point cloud of the manipulated object and a point cloud of the goal shape to learn a low-dimensional representation of the object shape. This shape embedding enables the robot to learn a visual servo controller that computes the desired robot end-effector action to iteratively deform the object toward the target shape. We demonstrate both in simulation and on a physical robot that DeformerNet reliably generalizes to object shapes and material stiffness not seen during training. Crucially, using DeformerNet, the robot successfully accomplishes three surgical sub-tasks: retraction (moving tissue aside to access a site underneath it), tissue wrapping (a sub-task in procedures like aortic stent placements), and connecting two tubular pieces of tissue (a sub-task in anastomosis). Comment: Submitted to IEEE Transactions on Robotics (T-RO). 18 pages, 25 figures. arXiv admin note: substantial text overlap with arXiv:2110.0468
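    A minimal PyTorch sketch of the pipeline described above: embed the current and goal point clouds into low-dimensional shape vectors, then regress an end-effector action from the pair of embeddings. The PointNet-style max-pooling encoder, the layer sizes, and the 6-DoF output are assumptions for illustration, not DeformerNet's actual architecture.

    import torch
    import torch.nn as nn

    class ShapeEncoder(nn.Module):
        """Per-point MLP followed by a max pool, yielding one shape embedding."""
        def __init__(self, embed_dim=64):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                     nn.Linear(64, embed_dim))
        def forward(self, points):                          # points: (B, N, 3)
            return self.mlp(points).max(dim=1).values       # (B, embed_dim)

    class ServoHead(nn.Module):
        """Maps (current, goal) shape embeddings to a 6-DoF end-effector action."""
        def __init__(self, embed_dim=64):
            super().__init__()
            self.enc = ShapeEncoder(embed_dim)
            self.head = nn.Sequential(nn.Linear(2 * embed_dim, 128), nn.ReLU(),
                                      nn.Linear(128, 6))
        def forward(self, current_pc, goal_pc):
            z = torch.cat([self.enc(current_pc), self.enc(goal_pc)], dim=-1)
            return self.head(z)                             # action toward goal shape

    # Hypothetical partial-view and goal point clouds with 512 points each.
    model = ServoHead()
    action = model(torch.randn(1, 512, 3), torch.randn(1, 512, 3))
    print(action.shape)                                     # torch.Size([1, 6])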