333 research outputs found
Guided data augmentation for improved semi-supervised image classification in low data regime.
Deep learning models have achieved state-of-the-art performance, especially in computer vision applications. Much of the recent success can be attributed to the existence of large, high-quality, labeled datasets. However, in many real-world applications, collecting similar datasets is often cumbersome and time-consuming. For instance, developing robust automatic target recognition models from infrared images still faces major challenges. This is mainly due to the difficulty of acquiring high-resolution inputs, sensitivity to the thermal sensors' calibration, meteorological conditions, and targets' scale and viewpoint invariance. Ideally, a good training set should contain enough variation within each class for the model to learn optimal decision boundaries. However, when there are under-represented regions in the training feature space, especially in a low data regime or in the presence of low-quality inputs, the model risks learning sub-optimal decision boundaries, resulting in sub-optimal predictions. This dissertation presents novel data augmentation (DA) strategies aimed at improving the performance of machine learning models in low data regimes. The proposed techniques are designed to augment limited labeled datasets, providing the models with additional information to learn from.
The first contribution of this work is the development of Confidence-Guided Generative Augmentation (CGG-DA), a technique that trains a generative model, such as a Variational Autoencoder (VAE) or a Deep Convolutional Generative Adversarial Network (DCGAN), to generate synthetic augmentations. These generative models can generate labeled and/or unlabeled data by drawing from the same distribution as the samples on which a baseline reference model under-performs. By augmenting the training dataset with these synthetic images, CGG-DA aims to bridge the performance gap across different regions of the training feature space.
We also introduce a Tool-Supported Contextual Augmentation (TSC-DA) technique that leverages existing ML models, such as classifiers or object detectors, to label available unlabeled data. Samples with consistent and high confidence predictions are used as labeled augmentations. Samples with low confidence predictions, on the other hand, might still contain some information even though they are more likely to be noisy and inconsistent. Hence, we keep them and use them as unlabeled samples during training. Our third proposed DA explores the use of existing ML tools and external image repositories for data augmentation. This approach, called Guided External Data Augmentation (EG-DA), leverages external image repositories to augment the available dataset. External repositories are typically noisy and may include many out-of-distribution (OOD) samples. If included in the training process without proper handling, OOD samples can confuse the model and degrade its performance. To tackle this issue, we design and train a VAE-based anomaly detection component and use it to filter out OOD samples. Since our DA includes both labeled data and a larger set of unlabeled data, we use semi-supervised training to exploit the information contained in the generated augmentations. This can guide the network to learn complex representations and generalize to new data. The proposed data augmentation techniques are evaluated on two computer vision applications under multiple scenarios. We also compare our approach, using benchmark datasets, to baseline models trained on the initial labeled data only and to existing data augmentation techniques. We show that each proposed augmentation consistently improves the results. We also perform an in-depth analysis to justify the observed improvements.
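The confidence-based split described for TSC-DA can be illustrated with a minimal sketch. It assumes a single classifier that outputs class probabilities; the function name `split_by_confidence` and the 0.9 threshold are hypothetical and do not come from the dissertation, and the consistency check mentioned in the abstract is not modeled here.

```python
def split_by_confidence(probs, threshold=0.9):
    """Split unlabeled samples by the confidence of a model's prediction.

    probs: list of per-sample class-probability lists.
    threshold: hypothetical cut-off, not a value from the dissertation.
    Returns (labeled, unlabeled), where labeled holds (index, class) pairs
    and unlabeled holds the indices of low-confidence samples.
    """
    labeled, unlabeled = [], []
    for i, row in enumerate(probs):
        # Index of the most probable class for this sample.
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= threshold:
            labeled.append((i, best))   # keep as a pseudo-labeled sample
        else:
            unlabeled.append(i)         # keep, but without a label
    return labeled, unlabeled


probs = [
    [0.97, 0.02, 0.01],  # confident: class 0
    [0.40, 0.35, 0.25],  # not confident
    [0.05, 0.92, 0.03],  # confident: class 1
]
labeled, unlabeled = split_by_confidence(probs)
```

In this toy input, samples 0 and 2 would be added as labeled augmentations and sample 1 would remain in the unlabeled pool used by the semi-supervised training step.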
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
Sonar image interpretation for sub-sea operations
Mine Counter-Measure (MCM) missions are conducted to neutralise underwater
explosives. Automatic Target Recognition (ATR) assists operators by
increasing the speed and accuracy of data review. ATR embedded on vehicles
enables adaptive missions which increase the speed of data acquisition. This
thesis addresses three challenges: the speed of data processing, the robustness of
ATR to environmental conditions and the large quantities of data required to
train an algorithm.
The main contribution of this thesis is a novel ATR algorithm. The algorithm
uses features derived from the projection of 3D boxes to produce a set of 2D
templates. The template responses are independent of grazing angle, range
and target orientation. Integer skewed integral images are derived to accelerate
the calculation of the template responses. The algorithm is compared
to the Haar cascade algorithm. For a single model of sonar and cylindrical
targets, the algorithm reduces the Probability of False Alarm (PFA) by 80%
at a Probability of Detection (PD) of 85%. The algorithm is trained on target
data from another model of sonar. The PD is only 6% lower even though no
representative target data was used for training.
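To illustrate why integral images speed up template responses, here is a minimal sketch of the standard (axis-aligned) integral image, also called a summed-area table. It is not the integer skewed variant derived in the thesis; the function names are hypothetical. Once the table is built, the sum over any rectangle needs only four lookups, regardless of the rectangle's size.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero border."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(img)
total = box_sum(sat, 0, 0, 3, 3)         # whole image
lower_right = box_sum(sat, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

A rectangular template response can then be computed as a weighted combination of a few such box sums.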
The second major contribution is an adaptive ATR algorithm that uses local
sea-floor characteristics to address the problem of ATR robustness with
respect to the local environment. A dual-tree wavelet decomposition of the
sea-floor and a Markov Random Field (MRF) based graph-cut algorithm are
used to segment the terrain. A Neural Network (NN) is then trained to filter
ATR results based on the local sea-floor context. It is shown, for the Haar
cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%.
Speed of data processing is addressed using novel pre-processing techniques.
The standard three-class MRF for sonar image segmentation is formulated
using graph-cuts. Consequently, a 1.2 million pixel image is segmented in
1.2 seconds. Additionally, local estimation of class models is introduced to
remove range dependent segmentation quality. Finally, an A* graph search
is developed to remove the surface return, a line of saturated pixels often
detected as false alarms by ATR. The A* search identifies the surface return
in 199 of 220 images tested with a runtime of 2.1 seconds. The algorithm is
robust to the presence of ripples and rocks.
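The idea of tracing a line of saturated pixels with an A* search can be sketched as follows. This is an illustration under stated assumptions, not the thesis's algorithm: the cost function, the left-to-right move set, the 8-bit pixel range, and the name `trace_bright_line` are all hypothetical.

```python
import heapq


def trace_bright_line(img, max_val=255):
    """A* search for the cheapest left-to-right path through bright pixels.

    Assumes pixel values lie in 0..max_val. Returns the row index of the
    path in each column, or an empty list if no path exists.
    """
    h, w = len(img), len(img[0])

    def cost(r, c):
        # Entering a saturated pixel costs 1; darker pixels cost more.
        return 1 + (max_val - img[r][c])

    def heuristic(c):
        # Each remaining column costs at least 1, so this never overestimates.
        return (w - 1) - c

    best, parent, heap = {}, {}, []
    for r in range(h):  # the path may start in any row of the first column
        g = cost(r, 0)
        best[(r, 0)] = g
        parent[(r, 0)] = None
        heapq.heappush(heap, (g + heuristic(0), g, r, 0))

    while heap:
        _, g, r, c = heapq.heappop(heap)
        if g > best[(r, c)]:
            continue  # stale queue entry
        if c == w - 1:
            # Reached the last column: walk back through the parents.
            rows, node = [], (r, c)
            while node is not None:
                rows.append(node[0])
                node = parent[node]
            return rows[::-1]
        for dr in (-1, 0, 1):  # step right, optionally one row up or down
            nr, nc = r + dr, c + 1
            if 0 <= nr < h:
                ng = g + cost(nr, nc)
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (ng + heuristic(nc), ng, nr, nc))
    return []


img = [
    [10, 10, 10, 10],
    [255, 255, 10, 10],
    [10, 10, 255, 255],
]
path_rows = trace_bright_line(img)
```

In this toy image the search follows the saturated pixels from row 1 down to row 2. Pixels on the returned path could then be masked out before running a detector.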