EARL: Eye-on-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation
In this paper, we explore the dynamic grasping of moving objects through
active pose tracking and reinforcement learning for hand-eye coordination
systems. Most existing vision-based robotic grasping methods implicitly assume
that target objects are stationary or moving predictably. Grasping
unpredictably moving objects presents a unique set of challenges: for example,
a pre-computed robust grasp can become unreachable or unstable as the target
object moves, and motion planning must also be adaptive. In this work, we
present a new approach, the Eye-on-hAnd Reinforcement Learner (EARL), that
enables coupled Eye-on-Hand (EoH) robotic manipulation systems to perform
real-time active pose tracking and dynamic grasping of novel objects without
explicit motion prediction. EARL addresses many thorny issues in automated
hand-eye coordination, including fast tracking of the 6D object pose from
vision, learning a control policy for a robotic arm to track a moving object
while keeping it in the camera's field of view, and performing dynamic
grasping. We demonstrate the effectiveness of our approach in extensive
experiments on multiple commercial robotic arms, in both simulation and
complex real-world tasks.
Comment: Presented at IROS 2023. Corresponding author: Siddarth Jai
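For illustration, a minimal sketch of how such an eye-on-hand
tracking-and-grasping loop could be structured is given below; the arm,
tracker, and policy interfaces are hypothetical placeholders for this sketch,
not EARL's actual API.

```python
import numpy as np

# Hypothetical eye-on-hand control loop: a vision-based 6D pose tracker feeds a
# learned policy that keeps the object in view while the gripper closes in.
# All object/method names below are illustrative placeholders, not EARL's API.

def dynamic_grasp_loop(arm, tracker, policy, grasp_threshold=0.02, dt=0.05):
    """Track a moving object and trigger a grasp once close enough."""
    while True:
        frame = arm.camera.read()              # wrist-mounted camera frame
        object_pose = tracker.update(frame)    # 6D pose in the camera frame
        ee_pose = arm.end_effector_pose()      # from forward kinematics

        # The policy observes the relative pose and outputs an end-effector
        # velocity that tracks the object while keeping it in the field of view.
        obs = np.concatenate([object_pose, ee_pose])
        ee_velocity = policy.act(obs)
        arm.command_velocity(ee_velocity, duration=dt)

        # Attempt the grasp once the gripper is within a small distance.
        if np.linalg.norm(object_pose[:3]) < grasp_threshold:
            arm.close_gripper()
            return arm.lift_and_verify()
```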
INTELLIGENT VISION-BASED NAVIGATION SYSTEM
This thesis presents a complete vision-based navigation system that can plan and
follow an obstacle-avoiding path to a desired destination on the basis of an internal map
updated with information gathered from its visual sensor.
For vision-based self-localization, the system uses new floor-edge-specific
filters for detecting floor edges and their pose, a new algorithm for
determining the orientation of the robot, and a new method for selecting the
initial positions in the self-localization procedure. Self-localization is
based on matching visually detected features with those stored in a prior map.
For planning, the system demonstrates for the first time a real-world
application of the neuro-resistive grid method to robot navigation. The
neuro-resistive grid is modified with a new connectivity scheme that allows
the representation of the collision-free space of a robot with finite
dimensions via divergent connections between the spatial memory layer and the
neuro-resistive grid layer.
A new control system is proposed. It uses a Smith Predictor architecture that
has been modified for navigation applications and for the intermittent,
delayed feedback typical of artificial vision. A receding-horizon control
strategy is implemented using normalised radial basis function (RBF) networks
as path encoders, to ensure continuous motion during the delay between
measurements.
The system is tested in a simplified environment where an obstacle placed
anywhere is detected visually and integrated into the path-planning process.
The results show the validity of the control concept and the crucial
importance of a robust vision-based self-localization process.
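To make the receding-horizon path encoding concrete, the sketch below fits a
normalised-RBF encoder to a handful of waypoints so that a smooth reference
point can be evaluated continuously between delayed vision measurements; the
centre placement, basis width, and 1-D path parameterisation are illustrative
assumptions, not the thesis's exact formulation.

```python
import numpy as np

def normalised_rbf_path(waypoints, n_centres=10, width=0.15):
    """Fit a normalised-RBF encoder to 2D waypoints parameterised by s in [0, 1].

    Returns a function path(s) giving a smooth (x, y) reference point, usable
    by the controller during the delay between vision measurements.
    """
    waypoints = np.asarray(waypoints, dtype=float)   # shape (M, 2)
    s_data = np.linspace(0.0, 1.0, len(waypoints))   # path parameter per waypoint
    centres = np.linspace(0.0, 1.0, n_centres)

    def basis(s):
        # Gaussian activations, normalised so they sum to one at every s.
        phi = np.exp(-((np.atleast_1d(s)[:, None] - centres) ** 2)
                     / (2.0 * width ** 2))
        return phi / phi.sum(axis=1, keepdims=True)

    # Least-squares fit of the RBF weights to the waypoints.
    weights, *_ = np.linalg.lstsq(basis(s_data), waypoints, rcond=None)

    def path(s):
        """Evaluate the encoded path at parameter s (scalar or array)."""
        return basis(s) @ weights

    return path

# Example: a gently curving path, sampled at the midpoint s = 0.5.
encoder = normalised_rbf_path([(0, 0), (1, 0.2), (2, 0.8), (3, 1.0)])
print(encoder(0.5))
```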
Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data
In this work, we propose a robust network-in-the-loop control system for autonomous navigation and landing of an Unmanned Aerial Vehicle (UAV). To estimate the UAV's absolute pose, we develop a deep neural network (DNN) architecture for visual-inertial odometry, which provides a robust alternative to traditional methods. We first evaluate the accuracy of the estimation by comparing the predictions of our model to traditional visual-inertial approaches on the publicly available EuRoC MAV dataset. The results indicate a clear improvement in pose-estimation accuracy of up to 25% over the baseline. Finally, we integrate the data-driven estimator into the closed-loop flight control system of AirSim, a simulator available as a plugin for Unreal Engine, and provide simulation results for autonomous navigation and landing.
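A minimal sketch of what such a visual-inertial pose regressor could look like
is given below (PyTorch); the layer sizes, the choice of CNN and LSTM encoders,
and the 6-DoF output parameterisation are assumptions for illustration, not
the paper's actual architecture.

```python
import torch
import torch.nn as nn

class VisualInertialOdometryNet(nn.Module):
    """Sketch of a visual-inertial pose regressor: a small CNN encodes a pair
    of consecutive images, an LSTM encodes the IMU window between them, and an
    MLP fuses both into a 6-DoF relative pose (translation + rotation)."""

    def __init__(self, imu_dim=6, hidden=128):
        super().__init__()
        self.visual = nn.Sequential(            # input: two stacked RGB frames
            nn.Conv2d(6, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.inertial = nn.LSTM(imu_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(128 + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 6),               # [tx, ty, tz, rx, ry, rz]
        )

    def forward(self, image_pair, imu_window):
        v = self.visual(image_pair)             # (B, 128) visual features
        _, (h, _) = self.inertial(imu_window)   # IMU window: (B, T, imu_dim)
        return self.head(torch.cat([v, h[-1]], dim=1))

# Per-step relative poses can be composed over time into an absolute pose.
net = VisualInertialOdometryNet()
pose = net(torch.randn(1, 6, 128, 128), torch.randn(1, 20, 6))
```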
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based
pose regressor with kinematic skeleton fitting. Our novel fully convolutional
pose formulation regresses 2D and 3D joint positions jointly in real time and
does not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control; thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e.,
it works for outdoor scenes, community videos, and low-quality commodity RGB
cameras.
Comment: Accepted to SIGGRAPH 201
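The kinematic skeleton fitting step can be pictured as a small nonlinear
least-squares problem that reconciles the CNN's 2D and 3D joint predictions
with a coherent skeleton; the residual terms and weights below are a
simplified sketch of that idea, not VNect's exact energy.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_skeleton(theta0, fk, project, joints_2d, joints_3d,
                 w2d=1.0, w3d=1.0, w_smooth=0.1, theta_prev=None):
    """Fit skeleton pose parameters theta to per-frame CNN predictions.

    fk(theta)    -> (J, 3) 3D joint positions from forward kinematics
    project(p3d) -> (J, 2) camera projection of 3D joint positions
    joints_2d/3d -> CNN-predicted joint positions for the current frame
    """
    def residuals(theta):
        p3d = fk(theta)
        r = [w3d * (p3d - joints_3d).ravel(),           # 3D alignment term
             w2d * (project(p3d) - joints_2d).ravel()]  # 2D reprojection term
        if theta_prev is not None:                      # temporal smoothness
            r.append(w_smooth * (theta - theta_prev))
        return np.concatenate(r)

    return least_squares(residuals, theta0).x
```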
Visual Servoing from Deep Neural Networks
We present a deep neural network-based method to perform high-precision,
robust, real-time 6-DOF visual servoing. The paper describes how to create a
dataset simulating various perturbations (occlusions and lighting conditions)
from a single real-world image of the scene. A convolutional neural network is
fine-tuned on this dataset to estimate the relative pose between two images
of the same scene. The output of the network is then employed in a visual
servoing control scheme. The method converges robustly even in difficult
real-world settings with strong lighting variations and occlusions. A
positioning error of less than one millimeter is obtained in experiments with
a 6-DOF robot.
Comment: fixed authors list
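For context, a classical pose-based visual servoing law that could consume
such a relative-pose estimate is sketched below; it assumes the error arrives
as a translation vector plus an axis-angle rotation vector and uses a single
gain, a textbook scheme rather than the paper's exact controller.

```python
import numpy as np

def pbvs_velocity(t_err, axis_angle_err, lam=0.5):
    """Pose-based visual servoing: exponentially drive the pose error to zero.

    t_err          -> (3,) translation from current to desired camera frame
    axis_angle_err -> (3,) rotation error as theta * u (axis-angle vector)
    Returns a 6-vector [v; omega] camera velocity command.
    """
    v = -lam * t_err                # linear velocity ~ position error
    omega = -lam * axis_angle_err   # angular velocity ~ rotation error
    return np.concatenate([v, omega])

# Each control step feeds the network's estimated relative pose to the law.
cmd = pbvs_velocity(np.array([0.01, -0.02, 0.05]), np.array([0.0, 0.1, 0.0]))
```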
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists of the concurrent
construction of a model of the environment (the map) and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping,
metric and semantic representations for mapping, theoretical performance
guarantees, active SLAM and exploration, and other new frontiers. This paper
simultaneously serves as a position paper and a tutorial for users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM?
and Is SLAM solved?
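The de-facto standard formulation mentioned above is maximum a posteriori
(MAP) estimation over a factor graph; under zero-mean Gaussian measurement
noise it reduces to a nonlinear least-squares problem over the robot and map
variables, given measurements with known models and information matrices:

```latex
\mathcal{X}^{\star}
  = \operatorname*{argmax}_{\mathcal{X}} \; p(\mathcal{X} \mid \mathcal{Z})
  = \operatorname*{argmax}_{\mathcal{X}} \; p(\mathcal{X})
      \prod_{k} p(z_k \mid \mathcal{X}_k)
  = \operatorname*{argmin}_{\mathcal{X}} \; \sum_{k}
      \lVert h_k(\mathcal{X}_k) - z_k \rVert^{2}_{\Omega_k}
```

Here $h_k$ is the measurement model for measurement $z_k$,
$\lVert e \rVert^{2}_{\Omega} = e^{\top}\Omega\,e$ is the Mahalanobis norm with
information matrix $\Omega$, and the prior $p(\mathcal{X})$ is absorbed into
the sum as an additional factor.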