AutoSimulate: (Quickly) Learning Synthetic Data Generation
Simulation is increasingly being used for generating large labelled datasets
in many machine learning problems. Recent methods have focused on adjusting
simulator parameters with the goal of maximising accuracy on a validation task,
usually relying on REINFORCE-like gradient estimators. However, these approaches
are very expensive as they treat the entire data generation, model training,
and validation pipeline as a black-box and require multiple costly objective
evaluations at each iteration. We propose an efficient alternative for optimal
synthetic data generation, based on a novel differentiable approximation of the
objective. This allows us to optimize the simulator, which may be
non-differentiable, requiring only one objective evaluation at each iteration
with little overhead. We demonstrate on a state-of-the-art photorealistic
renderer that the proposed method finds the optimal data distribution faster
than previous methods, with significantly reduced training data generation and
better accuracy on real-world test datasets.
Comment: ECCV 2020
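The core idea, replacing a black-box REINFORCE-style outer loop with a differentiable approximation of the train-then-validate objective, can be illustrated on a toy problem. The sketch below is hypothetical and much simpler than the paper's method: the "simulator" draws data from N(theta, 1), model training is approximated by a single differentiable gradient step, and the validation loss is differentiated through that step with respect to the simulator parameter.

```python
import numpy as np

# Toy sketch (not the paper's code): the simulator generates training data
# x ~ N(theta, 1); we want the theta whose data, after training, minimizes
# loss on a fixed validation set. Rather than treating the whole pipeline
# as a black box, we approximate training with one differentiable inner
# step and backpropagate through it (reparameterization: x = theta + eps).

rng = np.random.default_rng(0)
val_data = rng.normal(3.0, 1.0, size=256)  # validation set centered near 3.0

def approx_objective_grad(theta, lr_inner=0.5, n=256):
    """One objective evaluation per iteration: gradient of the validation
    loss w.r.t. the simulator parameter, through one inner training step."""
    eps = rng.normal(size=n)
    x = theta + eps                        # reparameterized synthetic samples
    # Model: scalar mu fit by one gradient step on MSE, starting from mu = 0.
    # d/dmu mean((x - mu)^2) at mu = 0 is -2*mean(x), so after one step:
    mu = lr_inner * 2.0 * x.mean()         # differentiable in theta
    dmu_dtheta = lr_inner * 2.0            # since d mean(x) / d theta = 1
    dval_dmu = -2.0 * (val_data - mu).mean()
    return dval_dmu * dmu_dtheta           # chain rule through the inner step

theta = 0.0
for _ in range(200):
    theta -= 0.1 * approx_objective_grad(theta)
```

Under this approximation, theta is driven toward the validation data's mean, i.e. the simulator learns to generate data matching the validation task, without any score-function estimator.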
On the Importance of Visual Context for Data Augmentation in Scene Understanding
Performing data augmentation for learning deep neural networks is known to be
important for training visual recognition systems. By artificially increasing
the number of training examples, it helps reduce overfitting and improve
generalization. While simple image transformations can already improve
predictive performance in most vision tasks, larger gains can be obtained by
leveraging task-specific prior knowledge. In this work, we consider object
detection, semantic and instance segmentation and augment the training images
by blending objects in existing scenes, using instance segmentation
annotations. We observe that randomly pasting objects on images hurts the
performance, unless the object is placed in the right context. To resolve this
issue, we propose an explicit context model by using a convolutional neural
network, which predicts whether an image region is suitable for placing a given
object or not. In our experiments, we show that our approach is able to improve
object detection, semantic and instance segmentation on the PASCAL VOC12 and
COCO datasets, with significant gains in a limited annotation scenario, i.e.
when only one category is annotated. We also show that the method is not
limited to datasets that come with expensive pixel-wise instance annotations
and can be used when only bounding boxes are available, by employing
weakly-supervised learning for instance masks approximation.
Comment: Updated the experimental section. arXiv admin note: substantial text
overlap with arXiv:1807.0742
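The placement strategy can be sketched in a few lines. The code below is an illustrative simplification (the function names and the scoring rule are hypothetical): where the paper trains a CNN context model to score candidate regions, this sketch uses a stand-in scorer, and "blending" is reduced to overwriting pixels.

```python
import numpy as np

# Context-guided pasting sketch: instead of pasting an object at a uniformly
# random location (which the paper observes can hurt performance), score a
# set of candidate regions and paste only into the most plausible one.

rng = np.random.default_rng(1)

def context_score(image, box):
    # Stand-in for the CNN context model: here, prefer regions whose local
    # intensity matches a nominal "ground" value of 0.5. A real context
    # model would predict placement suitability from learned features.
    y, x, h, w = box
    region = image[y:y+h, x:x+w]
    return -abs(region.mean() - 0.5)       # higher = more plausible placement

def paste_in_context(image, obj, n_candidates=32):
    H, W = image.shape
    h, w = obj.shape
    boxes = [(int(rng.integers(0, H - h)), int(rng.integers(0, W - w)), h, w)
             for _ in range(n_candidates)]
    best = max(boxes, key=lambda b: context_score(image, b))
    out = image.copy()
    y, x, _, _ = best
    out[y:y+h, x:x+w] = obj                # blending simplified to overwrite
    return out, best

# Demo: a scene whose bottom half (rows 16+) is "ground" at intensity 0.5.
scene = np.zeros((32, 32))
scene[16:, :] = 0.5
augmented, box = paste_in_context(scene, np.ones((4, 4)))
```

With this scorer, the object consistently lands in the "ground" half of the scene rather than at an arbitrary location, which is the effect the learned context model provides for real imagery.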
Deep Nuisance Disentanglement for Robust Object Detection from Unmanned Aerial Vehicles
Object detection from images captured by Unmanned Aerial Vehicles (UAVs) is becoming increasingly useful. Despite the great success of generic object detection methods trained on ground-to-ground images, a huge performance drop is observed when these methods are applied directly to images captured by UAVs. The unsatisfactory performance is owing to many UAV-specific nuisances, such as varying flying altitudes, adverse weather conditions, and dynamically changing viewing angles, which constitute a large number of fine-grained domains across which the detection model has to stay robust. Fortunately, UAVs record meta-data corresponding to these same varying attributes, which is either freely available along with the UAV images or easily obtained. We propose to utilize this free meta-data in conjunction with the associated UAV images to learn domain-robust features via an adversarial training framework. The resulting model, dubbed Nuisance Disentangled Feature Transforms (NDFT), targets the specific challenging problem of object detection in UAV images and achieves a substantial gain in robustness to these nuisances. We demonstrate the effectiveness of the proposed algorithm through both quantitative improvements on two existing UAV-based object detection benchmarks and qualitative improvements on self-collected UAV imagery.
Reprinted with permission from the Abstract section of Deep Nuisance Disentanglement for Robust Object Detection from Unmanned Aerial Vehicles by Zhenyu Wu†, Karthik Suresh†, Priya Narayanan, Hongyu Xu, Heesung Kwon, Zhangyang Wang, 2019, International Conference on Computer Vision (ICCV 2019) Proceedings (Under Review). † indicates equal contribution.
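The adversarial disentanglement idea behind NDFT can be illustrated with a linear toy model. The sketch below is hypothetical, not the paper's code: the feature extractor W is trained so that a task head can still predict the label while an adversary head cannot recover the nuisance attribute (playing the role of UAV meta-data such as altitude), via a reversed gradient on the adversary's loss.

```python
import numpy as np

# Toy adversarial nuisance disentanglement (illustrative, linear models).
# x[:, 0] carries the task signal; x[:, 1] carries a nuisance attribute
# known from meta-data. The feature z = x @ W should stay predictive of the
# task target y while becoming uninformative about the nuisance a.

rng = np.random.default_rng(2)
n = 512
x = rng.normal(size=(n, 2))
y = x[:, 0]                        # task target (regression for simplicity)
a = x[:, 1]                        # nuisance attribute from meta-data

W = rng.normal(size=2) * 0.1       # feature extractor: scalar feature z
v = 0.0                            # task head:      y_hat = v * z
u = 0.0                            # adversary head: a_hat = u * z
lr, lam = 0.05, 1.0

for _ in range(500):
    z = x @ W
    # Heads descend their own squared-error losses.
    v += lr * 2 * np.mean((y - v * z) * z)
    u += lr * 2 * np.mean((a - u * z) * z)
    # Feature extractor: minimize task loss, MAXIMIZE adversary loss
    # (gradient reversal on the adversary term).
    g_task = -2 * np.mean(((y - v * z) * v)[:, None] * x, axis=0)
    g_adv = -2 * np.mean(((a - u * z) * u)[:, None] * x, axis=0)
    W -= lr * (g_task - lam * g_adv)

z = x @ W
```

After training, the feature weight on the nuisance dimension is suppressed relative to the task dimension, so the adversary cannot recover the nuisance from z; in the paper this same game is played with deep features, real detection losses, and meta-data-labeled nuisance classifiers.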