4 research outputs found

    Image classification over unknown and anomalous domains

    A longstanding goal in computer vision research is to develop methods that are simultaneously applicable to a broad range of prediction problems. In contrast to this, models often perform best when they are specialized to some task or data type. This thesis investigates the challenges of learning models that generalize well over multiple unknown or anomalous modes and domains in data, and presents new solutions for learning robustly in this setting. Initial investigations focus on normalization for distributions that contain multiple sources (e.g. images in different styles like cartoons or photos). Experiments demonstrate the extent to which existing modules, batch normalization in particular, struggle with such heterogeneous data, and a new solution is proposed that can better handle data from multiple visual modes, using differing sample statistics for each. While ideas to counter the overspecialization of models have been formulated in sub-disciplines of transfer learning, e.g. multi-domain and multi-task learning, these usually rely on the existence of meta information, such as task or domain labels. Relaxing this assumption gives rise to a new transfer learning setting, called latent domain learning in this thesis, in which training and inference are carried out over data from multiple visual domains, without domain-level annotations. Customized solutions are required for this, as the performance of standard models degrades: a new data augmentation technique that interpolates between latent domains in an unsupervised way is presented, alongside a dedicated module that sparsely accounts for hidden domains in data, without requiring domain labels to do so. In addition, the thesis studies the problem of classifying previously unseen or anomalous modes in data, a fundamental problem in one-class learning, and anomaly detection in particular. 
    While recent ideas have focused on developing self-supervised solutions for the one-class setting, this thesis formulates new methods based on transfer learning. Extensive experimental evidence demonstrates that a transfer-based perspective benefits new problems that have recently been proposed in the anomaly detection literature, in particular challenging semantic detection tasks.

    An Object-Oriented Deep Multi-Sphere Support Vector Data Description Method for Impervious Surfaces Extraction Based on Multi-Sourced Data

    The effective extraction of impervious surfaces is critical to monitor their expansion and ensure the sustainable development of cities. Open geographic data can provide a large number of training samples for machine learning methods based on remote-sensed images to extract impervious surfaces due to their advantages of low acquisition cost and large coverage. However, training samples generated from open geographic data suffer from severe sample imbalance. Although one-class methods can effectively extract an impervious surface based on imbalanced samples, most of the current one-class methods ignore the fact that an impervious surface comprises varied geographic objects, such as roads and buildings. Therefore, this paper proposes an object-oriented deep multi-sphere support vector data description (OODMSVDD) method, which takes into account the diversity of impervious surfaces and incorporates a variety of open geographic data, including OpenStreetMap (OSM), Points of Interest (POIs), and trajectory GPS points, to automatically generate massive samples for model learning, thereby improving the extraction of impervious surfaces with varied types. The feasibility of the proposed method is experimentally verified with an overall accuracy of 87.43%, and its superior impervious surface classification performance is shown via comparative experiments. This provides a new, accurate, and more suitable extraction method for complex impervious surfaces.