6,184 research outputs found
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Instrumenting and collecting annotated visual grasping datasets to train
modern machine learning algorithms can be extremely time-consuming and
expensive. An appealing alternative is to use off-the-shelf simulators to
render synthetic data for which ground-truth annotations are generated
automatically. Unfortunately, models trained purely on simulated data often
fail to generalize to the real world. We study how randomized simulated
environments and domain adaptation methods can be extended to train a grasping
system to grasp novel objects from raw monocular RGB images. We extensively
evaluate our approaches with a total of more than 25,000 physical test grasps,
studying a range of simulation conditions and domain adaptation methods,
including a novel extension of pixel-level domain adaptation that we term the
GraspGAN. We show that, by using synthetic data and domain adaptation, we are
able to reduce the number of real-world samples needed to achieve a given level
of performance by up to 50 times, using only randomly generated simulated
objects. We also show that by using only unlabeled real-world data and our
GraspGAN methodology, we obtain real-world grasping performance without any
real-world labels that is similar to that achieved with 939,777 labeled
real-world samples.Comment: 9 pages, 5 figures, 3 table
Adversarial training with cycle consistency for unsupervised super-resolution in endomicroscopy
In recent years, endomicroscopy has become increasingly used for diagnostic
purposes and interventional guidance. It can provide intraoperative aids for
real-time tissue characterization and can help to perform visual investigations
aimed for example to discover epithelial cancers. Due to physical constraints
on the acquisition process, endomicroscopy images, still today have a low
number of informative pixels which hampers their quality. Post-processing
techniques, such as Super-Resolution (SR), are a potential solution to increase
the quality of these images. SR techniques are often supervised, requiring
aligned pairs of low-resolution (LR) and high-resolution (HR) images patches to
train a model. However, in our domain, the lack of HR images hinders the
collection of such pairs and makes supervised training unsuitable. For this
reason, we propose an unsupervised SR framework based on an adversarial deep
neural network with a physically-inspired cycle consistency, designed to impose
some acquisition properties on the super-resolved images. Our framework can
exploit HR images, regardless of the domain where they are coming from, to
transfer the quality of the HR images to the initial LR images. This property
can be particularly useful in all situations where pairs of LR/HR are not
available during the training. Our quantitative analysis, validated using a
database of 238 endomicroscopy video sequences from 143 patients, shows the
ability of the pipeline to produce convincing super-resolved images. A Mean
Opinion Score (MOS) study also confirms this quantitative image quality
assessment.Comment: Accepted for publication on Medical Image Analysis journa
Adversarial Deformation Regularization for Training Image Registration Neural Networks
We describe an adversarial learning approach to constrain convolutional
neural network training for image registration, replacing heuristic smoothness
measures of displacement fields often used in these tasks. Using
minimally-invasive prostate cancer intervention as an example application, we
demonstrate the feasibility of utilizing biomechanical simulations to
regularize a weakly-supervised anatomical-label-driven registration network for
aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural
transrectal ultrasound (TRUS) images. A discriminator network is optimized to
distinguish the registration-predicted displacement fields from the motion data
simulated by finite element analysis. During training, the registration network
simultaneously aims to maximize similarity between anatomical labels that
drives image alignment and to minimize an adversarial generator loss that
measures divergence between the predicted- and simulated deformation. The
end-to-end trained network enables efficient and fully-automated registration
that only requires an MR and TRUS image pair as input, without anatomical
labels or simulated data during inference. 108 pairs of labelled MR and TRUS
images from 76 prostate cancer patients and 71,500 nonlinear finite-element
simulations from 143 different patients were used for this study. We show that,
with only gland segmentation as training labels, the proposed method can help
predict physically plausible deformation without any other smoothness penalty.
Based on cross-validation experiments using 834 pairs of independent validation
landmarks, the proposed adversarial-regularized registration achieved a target
registration error of 6.3 mm that is significantly lower than those from
several other regularization methods.Comment: Accepted to MICCAI 201
Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation
Appearance changes due to weather and seasonal conditions represent a strong
impediment to the robust implementation of machine learning systems in outdoor
robotics. While supervised learning optimises a model for the training domain,
it will deliver degraded performance in application domains that underlie
distributional shifts caused by these changes. Traditionally, this problem has
been addressed via the collection of labelled data in multiple domains or by
imposing priors on the type of shift between both domains. We frame the problem
in the context of unsupervised domain adaptation and develop a framework for
applying adversarial techniques to adapt popular, state-of-the-art network
architectures with the additional objective to align features across domains.
Moreover, as adversarial training is notoriously unstable, we first perform an
extensive ablation study, adapting many techniques known to stabilise
generative adversarial networks, and evaluate on a surrogate classification
task with the same appearance change. The distilled insights are applied to the
problem of free-space segmentation for motion planning in autonomous driving.Comment: In Proceedings of the 2017 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS 2017
Adversarially Tuned Scene Generation
Generalization performance of trained computer vision systems that use
computer graphics (CG) generated data is not yet effective due to the concept
of 'domain-shift' between virtual and real data. Although simulated data
augmented with a few real world samples has been shown to mitigate domain shift
and improve transferability of trained models, guiding or bootstrapping the
virtual data generation with the distributions learnt from target real world
domain is desired, especially in the fields where annotating even few real
images is laborious (such as semantic labeling, and intrinsic images etc.). In
order to address this problem in an unsupervised manner, our work combines
recent advances in CG (which aims to generate stochastic scene layouts coupled
with large collections of 3D object models) and generative adversarial training
(which aims train generative models by measuring discrepancy between generated
and real data in terms of their separability in the space of a deep
discriminatively-trained classifier). Our method uses iterative estimation of
the posterior density of prior distributions for a generative graphical model.
This is done within a rejection sampling framework. Initially, we assume
uniform distributions as priors on the parameters of a scene described by a
generative graphical model. As iterations proceed the prior distributions get
updated to distributions that are closer to the (unknown) distributions of
target data. We demonstrate the utility of adversarially tuned scene generation
on two real-world benchmark datasets (CityScapes and CamVid) for traffic scene
semantic labeling with a deep convolutional net (DeepLab). We realized
performance improvements by 2.28 and 3.14 points (using the IoU metric) between
the DeepLab models trained on simulated sets prepared from the scene generation
models before and after tuning to CityScapes and CamVid respectively.Comment: 9 pages, accepted at CVPR 201
- …