593 research outputs found
Visual servoing of an autonomous helicopter in urban areas using feature tracking
We present the design and implementation of a vision-based feature tracking system for an autonomous helicopter. Visual sensing is used to estimate the position and velocity of features in the image plane (urban features such as windows) in order to generate velocity references for the flight control. These vision-based references are then combined with GPS-positioning references to navigate towards the features and track them. We present results from experimental flight trials, performed on two UAV systems and under different conditions, that show the feasibility and robustness of our approach.
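The abstract does not specify how the visual and GPS references are combined; a minimal sketch of one plausible scheme (proportional references blended by a fixed weight) is below. All names, gains, and the blending rule are assumptions for illustration, not the paper's method.

```python
import numpy as np

def blended_velocity_reference(feature_px, feature_goal_px, gps_pos, gps_goal,
                               k_img=0.5, k_gps=0.3, w_img=0.7):
    """Hypothetical sketch: proportional velocity references from image-plane
    feature error and from GPS position error, blended by a weight w_img."""
    # Image-based reference: drive the pixel error of the tracked feature to zero.
    v_img = k_img * (feature_goal_px - feature_px)   # 2D, image plane
    # GPS-based reference: drive the vehicle toward the goal position.
    v_gps = k_gps * (gps_goal - gps_pos)             # 3D, world frame
    # Blend (assumes both are expressed in a common horizontal frame).
    return w_img * v_img + (1.0 - w_img) * v_gps[:2]
```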
Compositional Servoing by Recombining Demonstrations
Learning-based manipulation policies from image inputs often show weak task
transfer capabilities. In contrast, visual servoing methods allow efficient
task transfer in high-precision scenarios while requiring only a few
demonstrations. In this work, we present a framework that formulates the visual
servoing task as graph traversal. Our method not only extends the robustness of
visual servoing, but also enables multitask capability based on a few
task-specific demonstrations. We construct demonstration graphs by splitting
existing demonstrations and recombining them. To traverse the demonstration graph at inference time, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-specific success. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach.
Comment: http://compservo.cs.uni-freiburg.d
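A minimal sketch of the graph-traversal idea as described in the abstract: segments of demonstrations become nodes, a similarity function gates edges, and a shortest path selects the recombination. The segment format, threshold, and edge cost are assumptions, not the paper's implementation.

```python
import networkx as nx

def build_demo_graph(segments, similarity, tau=0.8):
    """Hypothetical sketch: nodes are demonstration segments; an edge connects
    segment a to segment b when the end of a looks like the start of b.
    Edge cost is dissimilarity, so a shortest path prefers smooth recombination."""
    g = nx.DiGraph()
    g.add_nodes_from(range(len(segments)))
    for i, a in enumerate(segments):
        for j, b in enumerate(segments):
            if i != j:
                s = similarity(a["end_obs"], b["start_obs"])
                if s >= tau:
                    g.add_edge(i, j, weight=1.0 - s)
    return g

def select_path(g, segments, similarity, current_obs, goal_obs):
    """Pick the start segment most similar to the current observation and the
    goal segment most similar to the desired outcome, then run Dijkstra."""
    start = max(range(len(segments)),
                key=lambda i: similarity(current_obs, segments[i]["start_obs"]))
    goal = max(range(len(segments)),
               key=lambda i: similarity(goal_obs, segments[i]["end_obs"]))
    return nx.shortest_path(g, start, goal, weight="weight")
```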
CNS: Correspondence Encoded Neural Image Servo Policy
Image servoing is an indispensable technique in robotic applications that helps achieve high-precision positioning. The intermediate representation of an image servo policy is important for sensor input abstraction and policy output guidance. Classical approaches achieve high precision but require clean keypoint correspondence, and suffer from limited convergence basins or weak robustness to feature error. Recent learning-based methods achieve moderate precision and large convergence basins on specific scenes but face issues when generalizing to novel environments. In this paper, we encode keypoints and correspondence into a graph and use a graph neural network as the controller architecture. This design combines both advantages: a generalizable intermediate representation from keypoint correspondence and the strong modeling ability of neural networks. Additional techniques, including realistic data generation, feature clustering, and distance decoupling, are proposed to further improve efficiency, precision, and generalization. Experiments in simulation and the real world verify the effectiveness of our method in speed (up to 40 fps including the observer), precision (<0.3° and sub-millimeter accuracy), and generalization (sim-to-real without fine-tuning). Project homepage (full paper with supplementary text, video, and code): https://hhcaz.github.io/CNS-hom
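To make the "correspondence encoded in a graph, GNN as controller" idea concrete, here is a minimal sketch of one message-passing step over matched keypoints that decodes a velocity command. The node features, aggregation rule, and weight shapes are assumptions; the weights below stand in for a trained network.

```python
import numpy as np

def gnn_servo_step(kp_cur, kp_goal, W_upd, W_msg, W_out):
    """Hypothetical sketch: matched keypoints are graph nodes whose features are
    (current position, goal position, error); one round of mean-aggregation
    message passing over a fully connected graph; pool and decode a velocity."""
    err = kp_goal - kp_cur
    h = np.hstack([kp_cur, kp_goal, err])           # (N, 6) node features
    # Mean message to each node from all other nodes.
    msg = (h.sum(axis=0, keepdims=True) - h) / max(len(h) - 1, 1)
    h = np.tanh(h @ W_upd + msg @ W_msg)            # (N, hidden)
    # Pool over nodes and decode a 6-DoF camera velocity twist.
    return h.mean(axis=0) @ W_out                   # (6,)

# Toy usage with random (untrained) weights, hidden size 32:
rng = np.random.default_rng(0)
kp_cur, kp_goal = rng.random((8, 2)), rng.random((8, 2))
W_upd, W_msg, W_out = rng.normal(size=(6, 32)), rng.normal(size=(6, 32)), rng.normal(size=(32, 6))
twist = gnn_servo_step(kp_cur, kp_goal, W_upd, W_msg, W_out)
```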
Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models
In this paper, we propose a framework for building knowledgeable robot control in the scope of smart human-robot interaction, by empowering a basic uncalibrated visual servoing controller with contextual knowledge through the joint usage of event knowledge graphs (EKGs) and large-scale pretrained vision-language models (VLMs). The framework is twofold: first, we
interpret low-level image geometry as high-level concepts, allowing us to
prompt VLMs and to select geometric features of points and lines for motor
control skills; then, we create an event knowledge graph (EKG) to conceptualize
a robot manipulation task of interest, where the main body of the EKG is
characterized by an executable behavior tree, and the leaves by semantic
concepts relevant to the manipulation context. We demonstrate, in an
uncalibrated environment with real robot trials, that our method reduces reliance on human annotation during task interfacing, allows the robot to perform activities of daily living more easily by treating low-level geometry-based motor control skills as high-level concepts, and helps build cognitive capabilities for smart robot applications.
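A minimal sketch of the structure the abstract describes: a behavior tree whose leaves are semantic concepts that a VLM resolves into geometric features for a low-level servo skill. The class names, the VLM stub, and the status protocol are illustrative assumptions, not the paper's code.

```python
def resolve_with_vlm(concept):
    """Stub for a VLM query that maps a semantic concept (e.g. 'handle of mug')
    to image geometry; a real system would prompt a pretrained VLM here."""
    return {"points": [], "lines": []}

class Sequence:
    """Behavior-tree sequence: runs children in order, stopping at the first
    child that does not succeed."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != "SUCCESS":
                return status
        return "SUCCESS"

class ConceptLeaf:
    """Leaf bound to a semantic concept; the resolved geometric features
    (points/lines) are consumed by an uncalibrated visual-servo skill."""
    def __init__(self, concept, skill):
        self.concept, self.skill = concept, skill
    def tick(self):
        features = resolve_with_vlm(self.concept)
        return self.skill(features)   # returns "SUCCESS"/"FAILURE"/"RUNNING"

# Toy usage: a two-step manipulation task as an executable tree.
task = Sequence([ConceptLeaf("handle of mug", lambda f: "SUCCESS"),
                 ConceptLeaf("rim of cup", lambda f: "SUCCESS")])
print(task.tick())
```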
Analysis and Observations from the First Amazon Picking Challenge
This paper presents an overview of the inaugural Amazon Picking Challenge
along with a summary of a survey conducted among the 26 participating teams.
The challenge goal was to design an autonomous robot to pick items from a
warehouse shelf. This task is currently performed by human workers, and there
is hope that robots can someday help increase efficiency and throughput while
lowering cost. We report on a 28-question survey posed to the teams to learn
about each team's background, mechanism design, perception apparatus, planning
and control approach. We identify trends in this data, correlate it with each
team's success in the competition, and discuss observations and lessons learned
based on survey results and the authors' personal experiences during the
challenge.
DeformerNet: Learning Bimanual Manipulation of 3D Deformable Objects
Applications in fields ranging from home care to warehouse fulfillment to
surgical assistance require robots to reliably manipulate the shape of 3D
deformable objects. Analytic models of elastic 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom that determine the object's shape. Previous attempts at performing 3D
shape control rely on hand-crafted features to represent the object shape and
require training of object-specific control models. We overcome these issues
through the use of our novel DeformerNet neural network architecture, which
operates on a partial-view point cloud of the manipulated object and a point
cloud of the goal shape to learn a low-dimensional representation of the object
shape. This shape embedding enables the robot to learn a visual servo
controller that computes the desired robot end-effector action to iteratively
deform the object toward the target shape. We demonstrate both in simulation
and on a physical robot that DeformerNet reliably generalizes to object shapes
and material stiffness not seen during training. Crucially, using DeformerNet,
the robot successfully accomplishes three surgical sub-tasks: retraction
(moving tissue aside to access a site underneath it), tissue wrapping (a
sub-task in procedures like aortic stent placements), and connecting two
tubular pieces of tissue (a sub-task in anastomosis).
Comment: Submitted to IEEE Transactions on Robotics (T-RO). 18 pages, 25 figures. arXiv admin note: substantial text overlap with arXiv:2110.0468
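A minimal sketch of the servo loop the abstract describes: both partial-view point clouds are embedded into a low-dimensional shape space, and the embedding difference drives the end-effector. The encoder and controller below are crude toy stand-ins, not the trained DeformerNet networks.

```python
import numpy as np

def shape_servo_step(cloud_cur, cloud_goal, encoder, controller):
    """Hypothetical sketch of a DeformerNet-style step: embed the current and
    goal point clouds, then map the embedding error to an end-effector action."""
    z_cur = encoder(cloud_cur)         # (d,) learned shape embedding
    z_goal = encoder(cloud_goal)
    return controller(z_goal - z_cur)  # desired end-effector displacement

def toy_encoder(cloud):
    # Centroid + per-axis spread as a crude 6-D "shape" descriptor (assumption).
    return np.concatenate([cloud.mean(axis=0), cloud.std(axis=0)])

def toy_controller(dz, gain=0.1):
    # Proportional action on the translational part of the embedding error.
    return gain * dz[:3]

# Toy usage with random point clouds of 256 points each:
rng = np.random.default_rng(0)
action = shape_servo_step(rng.random((256, 3)), rng.random((256, 3)),
                          toy_encoder, toy_controller)
```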