333 research outputs found

    Convolutional Neural Networks - Generalizability and Interpretations

    Guided data augmentation for improved semi-supervised image classification in low data regime.

    Deep learning models have achieved state-of-the-art performance, especially for computer vision applications. Much of the recent success can be attributed to the existence of large, high-quality, labeled datasets. However, in many real-world applications, collecting similar datasets is often cumbersome and time consuming. For instance, developing robust automatic target recognition models from infrared images still faces major challenges. This is mainly due to the difficulty of acquiring high-resolution inputs, sensitivity to the thermal sensors' calibration, meteorological conditions, and targets' scale and viewpoint invariance. Ideally, a good training set should contain enough variation within each class for the model to learn optimal decision boundaries. However, when there are under-represented regions in the training feature space, especially in low-data regimes or in the presence of low-quality inputs, the model risks learning sub-optimal decision boundaries, resulting in sub-optimal predictions. This dissertation presents novel data augmentation (DA) strategies aimed at improving the performance of machine learning models in low-data regimes. The proposed techniques are designed to augment limited labeled datasets, providing the models with additional information to learn from. The first contribution of this work is the development of Confidence-Guided Generative Augmentation (CGG-DA), a technique that trains a generative model, such as a Variational Autoencoder (VAE) or a Deep Convolutional Generative Adversarial Network (DCGAN), to generate synthetic augmentations. These generative models can generate labeled and/or unlabeled data by drawing from the same distribution as the samples on which a baseline reference model under-performs. By augmenting the training dataset with these synthetic images, CGG-DA aims to bridge the performance gap across different regions of the training feature space. 
We also introduce a Tool-Supported Contextual Augmentation (TSC-DA) technique that leverages existing ML models, such as classifiers or object detectors, to label available unlabeled data. Samples with consistent and high-confidence predictions are used as labeled augmentations. Samples with low-confidence predictions, on the other hand, might still contain some information even though they are more likely to be noisy and inconsistent. Hence, we keep them and use them as unlabeled samples during training. Our third proposed DA explores the use of existing ML tools and external image repositories for data augmentation. This approach, called Guided External Data Augmentation (EG-DA), leverages external image repositories to augment the available dataset. External repositories are typically noisy and might include many out-of-distribution (OOD) samples. If included in the training process without proper handling, OOD samples can confuse the model and degrade its performance. To tackle this issue, we design and train a VAE-based anomaly detection component and use it to filter out OOD samples. Since our DA includes both labeled data and a larger set of unlabeled data, we use semi-supervised training to exploit the information contained in the generated augmentations. This can guide the network to learn complex representations and generalize to new data. The proposed data augmentation techniques are evaluated on two computer vision applications under multiple scenarios. We also compare our approach, using benchmark datasets, to baseline models trained on the initial labeled data only, and to existing data augmentation techniques. We show that each proposed augmentation consistently improves the results. We also perform an in-depth analysis to justify the observed improvements.
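
The confidence-based split described for TSC-DA can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function name `split_by_confidence`, the `threshold` value of 0.9, and the input format (one list of class probabilities per sample) are all assumptions made for illustration, and the "consistency" check mentioned in the abstract is not modeled here.

```python
def split_by_confidence(probabilities, threshold=0.9):
    """Split unlabeled samples by the confidence of an existing model.

    probabilities: one list of per-class probabilities per sample
                   (hypothetical input format).
    threshold:     assumed cut-off; the abstract does not state a value.

    Returns (pseudo_labeled, unlabeled):
      pseudo_labeled - list of (sample_index, predicted_class) pairs
      unlabeled      - list of sample indices kept without a label
    """
    pseudo_labeled, unlabeled = [], []
    for index, probs in enumerate(probabilities):
        # Index of the most probable class for this sample.
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            # Confident prediction: use it as a labeled augmentation.
            pseudo_labeled.append((index, best_class))
        else:
            # Low confidence: keep the sample, but without a label.
            unlabeled.append(index)
    return pseudo_labeled, unlabeled


example = [[0.95, 0.05], [0.60, 0.40], [0.02, 0.98]]
pseudo, rest = split_by_confidence(example)
```

Both outputs would then feed a semi-supervised training loop, with the pseudo-labeled pairs treated as extra labeled data and the rest as unlabeled data.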

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensin

    Sonar image interpretation for sub-sea operations

    Mine Counter-Measure (MCM) missions are conducted to neutralise underwater explosives. Automatic Target Recognition (ATR) assists operators by increasing the speed and accuracy of data review. ATR embedded on vehicles enables adaptive missions which increase the speed of data acquisition. This thesis addresses three challenges: the speed of data processing, the robustness of ATR to environmental conditions, and the large quantities of data required to train an algorithm. The main contribution of this thesis is a novel ATR algorithm. The algorithm uses features derived from the projection of 3D boxes to produce a set of 2D templates. The template responses are independent of grazing angle, range and target orientation. Integer skewed integral images are derived to accelerate the calculation of the template responses. The algorithm is compared to the Haar cascade algorithm. For a single model of sonar and cylindrical targets, the algorithm reduces the Probability of False Alarm (PFA) by 80% at a Probability of Detection (PD) of 85%. The algorithm is also trained on target data from another model of sonar. The PD is only 6% lower even though no representative target data was used for training. The second major contribution is an adaptive ATR algorithm that uses local sea-floor characteristics to address the problem of ATR robustness with respect to the local environment. A dual-tree wavelet decomposition of the sea-floor and a Markov Random Field (MRF) based graph-cut algorithm are used to segment the terrain. A Neural Network (NN) is then trained to filter ATR results based on the local sea-floor context. It is shown, for the Haar cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%. Speed of data processing is addressed using novel pre-processing techniques. The standard three-class MRF for sonar image segmentation is formulated using graph-cuts. Consequently, a 1.2 million pixel image is segmented in 1.2 seconds. 
Additionally, local estimation of class models is introduced to remove range-dependent segmentation quality. Finally, an A* graph search is developed to remove the surface return, a line of saturated pixels often detected as false alarms by ATR. The A* search identifies the surface return in 199 of 220 images tested, with a runtime of 2.1 seconds. The algorithm is robust to the presence of ripples and rocks.
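
The abstract credits integral images with accelerating the template responses. As a rough illustration of why they help, the sketch below builds a standard (axis-aligned) integral image, from which any rectangular sum can be read with four lookups. This is the textbook summed-area table, not the thesis's integer skewed variant, and the function names `integral_image` and `box_sum` are hypothetical.

```python
def integral_image(image):
    """Return a summed-area table with one extra row and column of zeros.

    table[r][c] holds the sum of image[0:r][0:c], so the padding row and
    column remove the need for boundary checks in box_sum.
    """
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        running = 0  # sum of the current row up to and including column c
        for c in range(cols):
            running += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + running
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


pixels = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
table = integral_image(pixels)
lower_right = box_sum(table, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```

Once the table is built in a single pass, each box sum costs the same four lookups regardless of box size, which is what makes template or Haar-like feature responses cheap to evaluate at many positions.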