The Event-Camera Dataset and Simulator: Event-based Data for Pose Estimation, Visual Odometry, and SLAM
New vision sensors, such as the Dynamic and Active-pixel Vision sensor
(DAVIS), incorporate a conventional global-shutter camera and an event-based
sensor in the same pixel array. These sensors have great potential for
high-speed robotics and computer vision because they allow us to combine the
benefits of conventional cameras with those of event-based sensors: low
latency, high temporal resolution, and very high dynamic range. However, new
algorithms are required to exploit the sensor characteristics and cope with its
unconventional output, which consists of a stream of asynchronous brightness
changes (called "events") and synchronous grayscale frames. For this purpose,
we present and release a collection of datasets captured with a DAVIS in a
variety of synthetic and real environments, which we hope will motivate
research on new algorithms for high-speed and high-dynamic-range robotics and
computer-vision applications. In addition to global-shutter intensity images
and asynchronous events, we provide inertial measurements and ground-truth
camera poses from a motion-capture system. The latter allows comparing the pose
accuracy of ego-motion estimation algorithms quantitatively. All the data are
released both as standard text files and binary files (i.e., rosbag). This
paper provides an overview of the available data and describes a simulator that
we release open-source to create synthetic event-camera data.

Comment: 7 pages, 4 figures, 3 tables
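As a minimal illustration of working with such data, the sketch below parses an event stream from a text file in Python, assuming the common one-event-per-line layout (timestamp, x, y, polarity); the file name and column order here are assumptions for illustration, not guaranteed by the dataset.

# Minimal sketch: parse asynchronous events from a text file.
# Assumed format (one event per line): timestamp x y polarity
from dataclasses import dataclass
from typing import List

@dataclass
class Event:
    t: float  # timestamp in seconds
    x: int    # pixel column
    y: int    # pixel row
    p: int    # polarity: 1 = brightness increase, 0 = decrease

def load_events(path: str) -> List[Event]:
    events = []
    with open(path) as f:
        for line in f:
            t, x, y, p = line.split()
            events.append(Event(float(t), int(x), int(y), int(p)))
    return events

# events = load_events("events.txt")  # hypothetical file name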
HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios
Estimating the 6D pose of objects is a major 3D computer vision problem.
Following the promising outcomes of instance-level approaches, research is also moving towards category-level pose estimation for more practical application scenarios. However, unlike well-established instance-level pose datasets, available category-level datasets lack annotation quality and provide only a limited number of poses. We propose the new category-level 6D pose dataset HouseCat6D, featuring 1) multi-modality of polarimetric RGB and depth (RGBD+P), 2) 194 highly diverse objects across 10 household object categories, including two photometrically challenging categories, 3) high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm, 4) 41 large-scale scenes with extensive viewpoint coverage and occlusions, 5) a checkerboard-free environment throughout all scenes, and 6) additionally annotated dense 6D parallel-jaw grasps. Furthermore, we provide benchmark results of state-of-the-art category-level pose estimation networks.
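To put the quoted annotation error range in context, a standard way to compare an estimated 6D pose against ground truth is to separate the error into a rotation angle and a translation distance. Below is a minimal Python sketch under the assumption that poses are given as 4x4 homogeneous matrices with translations in millimetres.

import numpy as np

def pose_error(T_pred: np.ndarray, T_gt: np.ndarray):
    """Rotation error (degrees) and translation error (mm) between
    two 4x4 homogeneous poses (translations assumed in mm)."""
    R_pred, R_gt = T_pred[:3, :3], T_gt[:3, :3]
    t_pred, t_gt = T_pred[:3, 3], T_gt[:3, 3]
    # Angle of the relative rotation R_pred^T R_gt.
    cos_angle = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    rot_err_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    trans_err_mm = np.linalg.norm(t_pred - t_gt)
    return rot_err_deg, trans_err_mm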
Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image using CNN
Collaborative robots are becoming more common on factory floors as well as in everyday environments; however, their safety is still not a fully solved issue. Collision detection does not always perform as expected, and collision avoidance remains an active research area. Collision avoidance works well for fixed robot-camera setups; however, if these are shifted around, the Eye-to-Hand calibration becomes invalid, making it difficult to accurately run many of the existing collision avoidance algorithms. We approach the problem by presenting
a stand-alone system capable of detecting the robot and estimating its
position, including individual joints, by using a simple 2D colour image as an
input, where no Eye-to-Hand calibration is needed. As an extension of previous
work, a two-stage transfer learning approach is used to re-train a
multi-objective convolutional neural network (CNN) to allow it to be used with
heterogeneous robot arms. Our method detects the robot in real time, and new robot types can be added with significantly smaller training datasets than a fully trained network requires. We present the data collection approach, the structure of the multi-objective CNN, the two-stage transfer learning training, and test results using real robots from Universal Robots, Kuka, and Franka Emika. Finally, we analyse possible application areas of our method together with possible improvements.

Comment: 6+n pages, ICRA 2019 submission
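The abstract does not specify the exact architecture, so the following Python sketch only illustrates the general two-stage transfer-learning pattern it describes: first train new task heads on a frozen pretrained backbone using the large source-robot dataset, then fine-tune part of the backbone on a much smaller dataset for a new robot type. Backbone choice, head sizes, and learning rates are assumptions.

import torch
import torch.nn as nn
from torchvision import models

# Stage 1: freeze a pretrained backbone; train only the new multi-objective
# heads on the large source-robot dataset.
backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()  # expose 512-d features
for p in backbone.parameters():
    p.requires_grad = False

heads = nn.ModuleDict({
    "detection": nn.Linear(512, 4),   # assumed: bounding-box head
    "joints": nn.Linear(512, 6 * 3),  # assumed: 3D positions of 6 joints
})
optimizer = torch.optim.Adam(heads.parameters(), lr=1e-3)
# ... train the heads on the source robot ...

# Stage 2: unfreeze the last backbone stage and fine-tune at a lower
# learning rate on the (much smaller) new-robot dataset.
for p in backbone.layer4.parameters():
    p.requires_grad = True
trainable = [p for p in list(backbone.parameters()) + list(heads.parameters())
             if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)
# ... fine-tune on the new robot type ...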
Multi-Modal Dataset Acquisition for Photometrically Challenging Objects
This paper addresses the limitations of current datasets for 3D vision tasks
in terms of accuracy, size, realism, and suitable imaging modalities for
photometrically challenging objects. We propose a novel annotation and
acquisition pipeline that enhances existing 3D perception and 6D object pose
datasets. Our approach integrates robotic forward-kinematics, external infrared
trackers, and improved calibration and annotation procedures. We present a
multi-modal sensor rig, mounted on a robotic end-effector, and demonstrate how
it is integrated into the creation of highly accurate datasets. Additionally,
we introduce a freehand procedure for wider viewpoint coverage. Both approaches
yield high-quality 3D data with accurate object and camera pose annotations.
Our methods overcome the limitations of existing datasets and provide valuable
resources for 3D vision research.

Comment: Accepted at ICCV 2023 TRICKY Workshop
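At the core of kinematics-based annotation is chaining the robot's forward kinematics with a calibrated hand-eye transform to obtain the camera pose in world coordinates. A minimal Python sketch of that composition (all transforms assumed to be 4x4 homogeneous matrices):

import numpy as np

def camera_pose_world(T_world_base: np.ndarray,
                      T_base_ee: np.ndarray,
                      T_ee_cam: np.ndarray) -> np.ndarray:
    """Camera pose in world coordinates: robot base in world, end-effector
    pose from forward kinematics, and a hand-eye transform from the
    end-effector to the camera (all 4x4 homogeneous matrices)."""
    return T_world_base @ T_base_ee @ T_ee_cam

Annotating a static object then amounts to expressing its world pose in each camera frame via the inverse of this chained transform.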
Extrinsic Parameter Calibration for Line Scanning Cameras on Ground Vehicles with Navigation Systems Using a Calibration Pattern
Line scanning cameras, which capture only a single line of pixels, have been
increasingly used in ground based mobile or robotic platforms. In applications
where it is advantageous to directly georeference the camera data to world
coordinates, an accurate estimate of the camera's 6D pose is required. This
paper focuses on the common case where a mobile platform is equipped with a
rigidly mounted line scanning camera, whose pose is unknown, and a navigation
system providing vehicle body pose estimates. We propose a novel method that
estimates the camera's pose relative to the navigation system. The approach
involves imaging and manually labelling a calibration pattern with distinctly
identifiable points, triangulating these points from camera and navigation
system data and reprojecting them in order to compute a likelihood, which is
maximised to estimate the 6D camera pose. Additionally, a Markov Chain Monte
Carlo (MCMC) algorithm is used to estimate the uncertainty of the offset.
Tested on two different platforms, the method was able to estimate the pose to
within 0.06 m / 1.05° and 0.18 m / 2.39°, respectively. We also propose
several approaches to displaying and interpreting the 6D results in a human
readable way.

Comment: Published in MDPI Sensors, 30 October 201
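The pose estimation step can be read as maximising a likelihood of the labelled pixel observations, which for Gaussian pixel noise is equivalent to minimising the sum of squared reprojection residuals over the 6D offset. A Python sketch of that objective is below; the project function, which must model the line scanning camera's geometry, is an assumed placeholder.

import numpy as np
from scipy.optimize import minimize

def reprojection_nll(offset6d, world_pts, observed_px, project, sigma=1.0):
    """Negative log-likelihood (up to a constant) of labelled pixel
    observations given a candidate 6D camera offset, assuming isotropic
    Gaussian pixel noise with standard deviation sigma."""
    residuals = np.array([observed_px[i] - project(offset6d, world_pts[i])
                          for i in range(len(world_pts))])
    return np.sum(residuals ** 2) / (2.0 * sigma ** 2)

# Hypothetical usage, given triangulated points and labelled pixels:
# result = minimize(reprojection_nll, x0=np.zeros(6),
#                   args=(world_pts, observed_px, project))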
Combining Differential Kinematics and Optical Flow for Automatic Labeling of Continuum Robots in Minimally Invasive Surgery
The segmentation of continuum robots in medical images can be of interest for analyzing surgical procedures or for controlling them. However, the automatic segmentation of continuous and flexible shapes is not an easy task. On the one hand, conventional approaches are not adapted to the specificities of these instruments, such as imprecise kinematic models; on the other hand, techniques based on deep learning have shown interesting capabilities but need many manually labeled images. In this article we propose a novel approach for segmenting continuum robots in endoscopic images, which requires no prior on the instrument's visual appearance and no manual annotation of images. The method relies on the combination of kinematic models and differential kinematic models of the robot with an analysis of optical flow in the images. A cost function aggregating information from the acquired image, from optical flow, and from robot encoders is optimized using particle swarm optimization, and provides estimated pose parameters of the continuum instrument and a mask defining the instrument in the image. In addition, temporal consistency is assessed in order to improve the stochastic optimization and reject outliers. The proposed approach has been tested on the robotic instruments of a flexible endoscopy platform, both for benchtop acquisitions and an in vivo video. The results show the ability of the technique to correctly segment the instruments without a prior, even in challenging conditions. The obtained segmentation can be used for several applications, for instance to provide automatic labels for machine learning techniques.
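Since the abstract leaves the optimizer details open, the sketch below shows only the generic particle swarm optimization pattern it refers to, in Python: a population of candidate parameter vectors is iteratively pulled toward each particle's own best and the swarm's best cost so far. Swarm size, inertia, and acceleration constants are textbook defaults, not the authors' settings.

import numpy as np

def pso_minimize(cost, dim, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0)):
    """Generic particle swarm minimiser; cost maps a parameter
    vector of length dim to a scalar."""
    lo, hi = bounds
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, size=(n_particles, dim))  # positions
    v = np.zeros_like(x)                              # velocities
    pbest = x.copy()                                  # per-particle best
    pbest_f = np.array([cost(p) for p in x])
    g = pbest[np.argmin(pbest_f)].copy()              # swarm best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([cost(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[np.argmin(pbest_f)].copy()
    return g, pbest_f.min()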
Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure
Automating precision subtasks such as debridement (removing dead or diseased
tissue fragments) with Robotic Surgical Assistants (RSAs) such as the da Vinci
Research Kit (dVRK) is challenging due to inherent non-linearities in
cable-driven systems. We propose and evaluate a novel two-phase coarse-to-fine
calibration method. In Phase I (coarse), we place a red calibration marker on
the end effector and let it randomly move through a set of open-loop
trajectories to obtain a large sample set of camera pixels and internal robot
end-effector configurations. This coarse data is then used to train a Deep
Neural Network (DNN) to learn the coarse transformation bias. In Phase II
(fine), the bias from Phase I is applied to move the end-effector toward a
small set of specific target points on a printed sheet. For each target, a
human operator manually adjusts the end-effector position by direct contact
(not through teleoperation) and the residual compensation bias is recorded.
This fine data is then used to train a Random Forest (RF) to learn the fine
transformation bias. Subsequent experiments suggest that without calibration, position errors average 4.55mm. Phase I can reduce the average error to 2.14mm, and the combination of Phase I and Phase II can reduce it to 1.08mm. We
apply these results to debridement of raisins and pumpkin seeds as fragment
phantoms. Using an endoscopic stereo camera with standard edge detection,
experiments with 120 trials achieved average success rates of 94.5%, exceeding
prior results with much larger fragments (89.4%) and achieving a speedup of
2.1x, decreasing time per fragment from 15.8 seconds to 7.3 seconds. Source
code, data, and videos are available at
https://sites.google.com/view/calib-icra/.

Comment: Final version for ICRA 201
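The two-phase idea can be summarised as learning two stacked bias corrections: a neural network fitted on the large coarse sample, and a random forest fitted on the small set of manually corrected residuals. The Python sketch below illustrates that pattern with scikit-learn; variable names, model sizes, and the sign convention of the corrections are assumptions, not the authors' exact implementation.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor

# Phase I (coarse): learn the systematic bias between commanded and
# observed end-effector positions from a large random-motion sample.
coarse = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
# coarse.fit(commanded_xyz, observed_xyz - commanded_xyz)

# Phase II (fine): learn the residual bias that remains after the coarse
# correction, from a small set of manually adjusted target points.
fine = RandomForestRegressor(n_estimators=100)
# fine.fit(target_xyz_points, manual_residuals)

def corrected_command(target_xyz: np.ndarray) -> np.ndarray:
    """Subtract the predicted coarse and fine biases from a desired
    target position before commanding the robot (assumed convention)."""
    t = target_xyz.reshape(1, -1)
    return target_xyz - coarse.predict(t)[0] - fine.predict(t)[0]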