88 research outputs found

    Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery

    Fine-grained object recognition, which aims to identify the type of an object among a large number of subcategories, is an emerging application enabled by the increasing resolution that exposes new details in image data. Traditional fully supervised algorithms fail to handle this problem, where there is low between-class variance and high within-class variance for the classes of interest with small sample sizes. We study an even more extreme scenario named zero-shot learning (ZSL), in which no training example exists for some of the classes. ZSL aims to build a recognition model for new unseen categories by relating them to seen classes that were previously learned. We establish this relation by learning a compatibility function between image features extracted via a convolutional neural network and auxiliary information that describes the semantics of the classes of interest, using training samples from the seen classes. Then, we show how knowledge transfer can be performed for the unseen classes by maximizing this function during inference. We introduce a new data set that contains 40 different types of street trees in 1-ft spatial resolution aerial data, and evaluate the performance of this model with manually annotated attributes, a natural language model, and a scientific taxonomy as auxiliary information. The experiments show that the proposed model achieves 14.3% recognition accuracy for the classes with no training examples, which is significantly better than both the random-guess accuracy of 6.3% for 16 test classes and three other ZSL algorithms.
    Comment: G. Sumbul, R. G. Cinbis, S. Aksoy, "Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery", IEEE Transactions on Geoscience and Remote Sensing (TGRS), in press, 201
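
To make the inference step concrete, here is a minimal sketch of choosing an unseen class by maximizing a compatibility function. The bilinear form F(x, y) = xᵀ W φ(y), the matrix `W`, the toy class embeddings, and all function names are assumptions for illustration; the abstract does not specify the exact form. The last line also reproduces the chance level for 16 equally likely test classes, which the abstract rounds to 6.3%.

```python
from typing import Dict, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def compatibility(x: Vector, W: Matrix, phi: Vector) -> float:
    """Bilinear score F(x, y) = x^T W phi(y) (assumed form, not from the abstract)."""
    return sum(
        x[i] * W[i][j] * phi[j]
        for i in range(len(x))
        for j in range(len(phi))
    )


def predict(x: Vector, W: Matrix, class_embeddings: Dict[str, Vector]) -> str:
    """Return the class whose embedding maximizes the compatibility score."""
    return max(
        class_embeddings,
        key=lambda c: compatibility(x, W, class_embeddings[c]),
    )


# Toy setup (hypothetical): 2-D image features, 2-D class embeddings, identity W.
W = [[1.0, 0.0], [0.0, 1.0]]
unseen = {"oak": [1.0, 0.0], "maple": [0.0, 1.0]}
print(predict([0.2, 0.9], W, unseen))  # maple

# Chance level for 16 equally likely test classes, as a fraction.
chance = 1 / 16
print(f"{chance:.4f}")  # 0.0625
```

In practice `W` would be learned from the seen classes only; at test time only the embeddings of the unseen classes are passed to `predict`.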

    Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

    Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly supervised learning. In this case, the supervised information is restricted to binary labels that indicate the absence/presence of object instances in the image, without their locations. We follow a multiple-instance learning approach that iteratively trains the detector and infers the object locations in the positive training images. Our main contribution is a multi-fold multiple instance learning procedure, which prevents training from prematurely locking onto erroneous object locations. This procedure is particularly important when using high-dimensional representations, such as Fisher vectors and convolutional neural network features. We also propose a window refinement method, which improves the localization accuracy by incorporating an objectness prior. We present a detailed experimental evaluation using the PASCAL VOC 2007 dataset, which verifies the effectiveness of our approach.
    Comment: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
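
The multi-fold idea can be sketched as follows: positive images are split into folds, and the windows in each fold are re-selected by a detector trained only on the other folds, so a detector never re-scores the windows it was trained on. This is a toy sketch under stated assumptions: window features are scalars, and `train_toy`, `score_toy`, and the round-robin fold assignment are hypothetical stand-ins, not the detector or split used in the paper.

```python
from typing import Callable, List, Sequence


def multifold_relocalize(
    windows_per_image: Sequence[Sequence[float]],
    selected: Sequence[int],
    negatives: Sequence[float],
    train: Callable[[List[float], Sequence[float]], float],
    score: Callable[[float, float], float],
    num_folds: int,
) -> List[int]:
    """Run one re-localization pass over all positive images.

    windows_per_image[i] holds the candidate window features of image i,
    and selected[i] is the index of the window currently chosen for it.
    """
    n = len(windows_per_image)
    new_selected = list(selected)
    for k in range(num_folds):
        held_out = [i for i in range(n) if i % num_folds == k]
        train_ids = [i for i in range(n) if i % num_folds != k]
        # Train only on windows selected in the *other* folds.
        positives = [windows_per_image[i][selected[i]] for i in train_ids]
        detector = train(positives, negatives)
        # Re-select the top-scoring window in each held-out image.
        for i in held_out:
            scores = [score(detector, w) for w in windows_per_image[i]]
            new_selected[i] = scores.index(max(scores))
    return new_selected


def train_toy(positives: List[float], negatives: Sequence[float]) -> float:
    """Hypothetical 'detector': difference of class means (a scalar weight)."""
    return sum(positives) / len(positives) - sum(negatives) / len(negatives)


def score_toy(detector: float, window: float) -> float:
    """Hypothetical scoring: weight times the scalar window feature."""
    return detector * window


windows = [[0.1, 0.9], [0.8, 0.2], [0.3, 1.0], [0.95, 0.0]]
initial = [1, 0, 0, 0]  # image 2 starts on its weaker window
updated = multifold_relocalize(
    windows, initial, [0.0, 0.1], train_toy, score_toy, num_folds=2
)
print(updated)  # [1, 0, 1, 0]
```

A full procedure would repeat this pass until the selections stop changing, then train a final detector on all selected windows.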

    Attributes2Classname: A discriminative model for attribute-based unsupervised zero-shot learning

    We propose a novel approach for unsupervised zero-shot learning (ZSL) of classes based on their names. Most existing unsupervised ZSL methods aim to learn a model for directly comparing image features and class names. However, this proves to be a difficult task due to the dominance of non-visual semantics in the underlying vector-space embeddings of class names. To address this issue, we discriminatively learn a word representation such that the similarities between class names and combinations of attribute names fall in line with the visual similarity. Contrary to traditional zero-shot learning approaches that are built upon attribute presence, our approach bypasses the laborious attribute-class relation annotations for unseen classes. In addition, our proposed approach makes text-only training possible, so the training can be augmented without the need to collect additional image data. The experimental results show that our method yields state-of-the-art results for unsupervised ZSL on three benchmark datasets.
    Comment: To appear at IEEE Int. Conference on Computer Vision (ICCV) 201

    Zero-Shot Object Detection by Hybrid Region Embedding

    Object detection is considered one of the most challenging problems in computer vision, since it requires correct prediction of both the classes and the locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD), where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, in which a convex combination of embeddings is used in conjunction with a detection framework. For the evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images and also a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.
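
One common way to use a convex combination of embeddings is to weight the word embeddings of the seen classes by the detector's normalized class scores for a region, then pick the unseen class whose embedding is closest. The sketch below assumes that reading; the class names, the toy 2-D embeddings, and the cosine-similarity matching rule are illustrative assumptions, not details taken from the abstract.

```python
import math
from typing import Dict, List, Sequence


def convex_combination(
    probs: Sequence[float], seen_embeddings: Sequence[Sequence[float]]
) -> List[float]:
    """Weighted average of seen-class embeddings; weights are normalized to sum to 1."""
    total = sum(probs)
    weights = [p / total for p in probs]
    dim = len(seen_embeddings[0])
    return [
        sum(w * e[d] for w, e in zip(weights, seen_embeddings))
        for d in range(dim)
    ]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_unseen(
    region_embedding: Sequence[float], unseen: Dict[str, Sequence[float]]
) -> str:
    """Return the unseen class whose embedding is most similar to the region."""
    return max(unseen, key=lambda c: cosine(region_embedding, unseen[c]))


# Hypothetical seen classes "cat" and "truck", with toy 2-D word embeddings.
seen = [[1.0, 0.0], [0.0, 1.0]]
region = convex_combination([0.8, 0.2], seen)
label = nearest_unseen(region, {"dog": [0.9, 0.1], "bus": [0.1, 0.9]})
print(label)  # dog
```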

    Robust and Reliable Stochastic Resource Allocation via Tail Waterfilling

    Stochastic allocation of resources in the context of wireless systems ultimately demands reactive decision making for meaningfully optimizing network-wide random utilities, while respecting certain resource constraints. Standard ergodic-optimal policies are, however, susceptible to the statistical variability of fading, often leading to systems which are severely unreliable and spectrally wasteful. On the flip side, minimax/outage-optimal policies are too pessimistic and often hard to determine. We propose a new risk-aware formulation of the resource allocation problem for standard multi-user point-to-point power-constrained communication with no cross-interference, by employing the Conditional Value-at-Risk (CV@R) as a measure of fading risk. A remarkable feature of this approach is that it is a convex generalization of the ergodic setting while inducing robustness and reliability in a fully tunable way, thus bridging the gap between the (naive) ergodic and (conservative) minimax approaches. We provide a closed-form expression for the CV@R-optimal policy given primal/dual variables, extending the classical stochastic waterfilling policy. We then develop a primal-dual tail-waterfilling scheme to recursively learn a globally optimal risk-aware policy. The effectiveness of the approach is verified via detailed simulations.
    Comment: 5 pages, 7 figures
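
The bridging behaviour of CV@R can be illustrated with a small empirical estimate. The sketch assumes the Rockafellar-Uryasev variational form CV@R_α(Z) = min_t { t + E[(Z − t)₊] / α } with α in (0, 1], applied to samples of a cost Z; whether the paper uses exactly this convention is an assumption. Under it, α = 1 recovers the plain average (the ergodic case) and small α approaches the worst case (the minimax case).

```python
from typing import Sequence


def cvar(samples: Sequence[float], alpha: float) -> float:
    """Empirical CV@R of a cost at level alpha in (0, 1].

    Minimizes t + mean(max(z - t, 0)) / alpha over t. The objective is
    convex and piecewise linear with breakpoints at the samples, so it is
    enough to evaluate it at each sample value.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(samples)

    def objective(t: float) -> float:
        return t + sum(max(z - t, 0.0) for z in samples) / (alpha * n)

    return min(objective(t) for t in samples)


costs = [1.0, 2.0, 3.0, 4.0]
print(cvar(costs, 1.0))   # 2.5, the sample mean
print(cvar(costs, 0.5))   # 3.5, the mean of the worst half
print(cvar(costs, 0.25))  # 4.0, the worst sample
```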

    Cross-task weakly supervised learning from instructional videos

    In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: 'pour egg' should be trained jointly with other tasks involving 'pour' and 'egg'. We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit a systematic study of sharing, so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level, and that our component model can parse previously unseen tasks by virtue of its compositionality.
    Comment: 18 pages, 17 figures, to be published in proceedings of the CVPR, 201

    High-level efficient constraint dominance programming for pattern mining problems

    Pattern mining is a sub-field of data mining that focuses on discovering patterns in data to extract knowledge. There are various techniques to identify different types of patterns in a dataset. Constraint-based mining is a well-known approach in which additional constraints are introduced to retrieve only interesting patterns. However, these systems are limited in their ability to impose complex constraints. Constraint programming is a declarative methodology where the problem is modelled using constraints, and generic solvers operate on the model to find the solutions. Constraint programming has been shown to be a well-suited and generic framework for various pattern mining problems with a selection of constraints and their combinations. However, a system that handles arbitrary constraints in a generic way has been missing in this field. In this thesis, we propose a declarative framework where pattern mining models can be represented in high-level constraint specifications with arbitrary additional constraints. These models can be efficiently solved using underlying optimisations. The first contribution of this thesis is to determine the key aspects of solving pattern mining problems by creating an ad-hoc solver system. We investigate this further and create Constraint Dominance Programming (CDP) to capture certain behaviours of pattern mining problems in an abstract way. To that end, we integrate CDP into the high-level Essence pipeline. Early empirical evaluation shows that CDP is already competitive with existing techniques. The second contribution of this thesis is to exploit an additional behaviour, incomparability, in pattern mining problems. By adding the incomparability condition to CDP, we create CDP+I, a more explicit and even more efficient framework to represent these problems. We also prototype an automated system to deduce the optimal incomparability information for a given modelled problem.
    The third contribution of this thesis is to focus on the underlying solving of CDP+I to bring further efficiency. By creating the Solver Interactive Interface (SII) on SAT and SMT back-ends, we highly optimise not only CDP+I but any iterative modelling and solving, such as optimisation problems. The final contribution of this thesis is to investigate an automated configuration selection system that determines the best-performing solving methodologies for CDP+I, and to introduce a portfolio of configurations that can perform better than any single best solver. In summary, this thesis presents a highly efficient, high-level declarative framework to tackle pattern mining problems.
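
A dominance relation in pattern mining can be illustrated with closed frequent itemsets: a frequent itemset is dominated if a strict superset has the same support, and only non-dominated itemsets are kept. The sketch below is a brute-force enumerate-then-filter illustration of that relation only; it is not how CDP, the Essence pipeline, or a SAT/SMT back-end solves the problem, and the function names and toy transactions are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Enumerate every non-empty itemset with support >= min_support."""
    items = sorted(set().union(*transactions))
    result: Dict[FrozenSet[str], int] = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def non_dominated(solutions: Dict[FrozenSet[str], int]) -> Dict[FrozenSet[str], int]:
    """Drop itemsets dominated by a strict superset with equal support."""
    return {
        itemset: support
        for itemset, support in solutions.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in solutions.items()
        )
    }


transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = non_dominated(frequent)
print(sorted("".join(sorted(s)) for s in closed))  # ['a', 'ab', 'ac']
```

Here `{b}` and `{c}` are frequent but dominated by `{a, b}` and `{a, c}`, which have the same support, so only `{a}`, `{a, b}`, and `{a, c}` remain.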