Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery
Fine-grained object recognition, which aims to identify the type of an object
among a large number of subcategories, is an emerging application with the
increasing resolution that exposes new details in image data. Traditional fully
supervised algorithms fail to handle this problem where there is low
between-class variance and high within-class variance for the classes of
interest with small sample sizes. We study an even more extreme scenario named
zero-shot learning (ZSL) in which no training example exists for some of the
classes. ZSL aims to build a recognition model for new unseen categories by
relating them to seen classes that were previously learned. We establish this
relation by learning a compatibility function between image features extracted
via a convolutional neural network and auxiliary information that describes the
semantics of the classes of interest by using training samples from the seen
classes. Then, we show how knowledge transfer can be performed for the unseen
classes by maximizing this function during inference. We introduce a new data
set that contains 40 different types of street trees in 1-ft spatial resolution
aerial data, and evaluate the performance of this model with manually annotated
attributes, a natural language model, and a scientific taxonomy as auxiliary
information. The experiments show that the proposed model achieves 14.3%
recognition accuracy for the classes with no training examples, which is
significantly better than both the 6.3% accuracy of random guessing among the 16
test classes and three other ZSL algorithms.
Comment: G. Sumbul, R. G. Cinbis, S. Aksoy, "Fine-Grained Object Recognition
and Zero-Shot Learning in Remote Sensing Imagery", IEEE Transactions on
Geoscience and Remote Sensing (TGRS), in press, 201
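The inference step described in this abstract (choosing the unseen class that maximizes a learned compatibility function) can be illustrated with a minimal sketch. The bilinear form F(x, c) = x^T W phi(c), the function names, and the toy numbers are assumptions made for illustration; the abstract does not specify the form of the compatibility function.

```python
from typing import Dict, List, Sequence


def compatibility(x: Sequence[float], W: List[List[float]], phi: Sequence[float]) -> float:
    """Assumed bilinear compatibility F(x, c) = x^T W phi(c)."""
    return sum(
        x[i] * W[i][j] * phi[j]
        for i in range(len(x))
        for j in range(len(phi))
    )


def predict(x: Sequence[float], W: List[List[float]],
            class_embeddings: Dict[str, Sequence[float]]) -> str:
    """Return the candidate class whose auxiliary embedding maximizes F."""
    return max(class_embeddings, key=lambda c: compatibility(x, W, class_embeddings[c]))


# Hypothetical toy values: W would be learned from seen classes only.
W = [[1.0, 0.0],
     [0.0, 1.0]]
unseen_classes = {"oak": [0.9, 0.1], "pine": [0.1, 0.9]}
image_feature = [1.0, 0.0]

print(predict(image_feature, W, unseen_classes))  # prints "oak"
```

Because W maps image features into the space of the auxiliary class descriptions, classes that had no training images can still be scored as long as their descriptions are available.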
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Object category localization is a challenging problem in computer vision.
Standard supervised training requires bounding box annotations of object
instances. This time-consuming annotation process is sidestepped in weakly
supervised learning. In this case, the supervised information is restricted to
binary labels that indicate the absence/presence of object instances in the
image, without their locations. We follow a multiple-instance learning approach
that iteratively trains the detector and infers the object locations in the
positive training images. Our main contribution is a multi-fold multiple
instance learning procedure, which prevents training from prematurely locking
onto erroneous object locations. This procedure is particularly important when
using high-dimensional representations, such as Fisher vectors and
convolutional neural network features. We also propose a window refinement
method, which improves the localization accuracy by incorporating an objectness
prior. We present a detailed experimental evaluation using the PASCAL VOC 2007
dataset, which verifies the effectiveness of our approach.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI
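The multi-fold idea in this abstract can be sketched roughly as follows: the windows of each positive image are re-selected by a detector trained only on the other folds, so an image never re-localizes itself with a detector that has already fit its current guess. This is a toy illustration, not the authors' implementation; the difference-of-means "detector" stands in for a real classifier, and all names and data are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean(vectors: List[Vector]) -> List[float]:
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def train_detector(positives: List[Vector], negatives: List[Vector]) -> List[float]:
    # Toy linear detector: difference of class means (assumed stand-in for an SVM).
    return [p - q for p, q in zip(mean(positives), mean(negatives))]


def score(w: Vector, x: Vector) -> float:
    return sum(a * b for a, b in zip(w, x))


def multifold_mil(bags: List[List[Vector]], negatives: List[Vector],
                  num_folds: int = 2, iterations: int = 3) -> List[int]:
    """Return, for each positive image (bag), the index of the selected window."""
    selected = [0] * len(bags)  # initial guess: first window of every bag
    for _ in range(iterations):
        new_selected = list(selected)
        for fold in range(num_folds):
            train_idx = [i for i in range(len(bags)) if i % num_folds != fold]
            held_out = [i for i in range(len(bags)) if i % num_folds == fold]
            # Train on the current selections of the other folds only.
            w = train_detector([bags[i][selected[i]] for i in train_idx], negatives)
            # Re-localize the held-out images with that detector.
            for i in held_out:
                new_selected[i] = max(range(len(bags[i])),
                                      key=lambda j: score(w, bags[i][j]))
        selected = new_selected
    return selected


# Hypothetical data: object windows look like [1, 0], background like [0, 1].
bags = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[1.0, 0.1], [0.0, 0.9]],
    [[0.9, 0.0], [0.1, 1.0]],
    [[0.0, 1.0], [1.0, 0.2]],
]
negatives = [[0.0, 1.0], [0.1, 0.9]]

print(multifold_mil(bags, negatives))  # prints [1, 0, 0, 1]
```

In this toy run the two images that start on a background window are corrected in the first iteration, because the detector scoring them was trained without their erroneous initial selections.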
Attributes2Classname: A discriminative model for attribute-based unsupervised zero-shot learning
We propose a novel approach for unsupervised zero-shot learning (ZSL) of
classes based on their names. Most existing unsupervised ZSL methods aim to
learn a model for directly comparing image features and class names. However,
this proves to be a difficult task due to dominance of non-visual semantics in
underlying vector-space embeddings of class names. To address this issue, we
discriminatively learn a word representation such that the similarities between
class and combination of attribute names fall in line with the visual
similarity. Contrary to the traditional zero-shot learning approaches that are
built upon attribute presence, our approach bypasses the laborious
attribute-class relation annotations for unseen classes. In addition, our
proposed approach renders text-only training possible, hence, the training can
be augmented without the need to collect additional image data. The
experimental results show that our method yields state-of-the-art results for
unsupervised ZSL in three benchmark datasets.
Comment: To appear at IEEE Int. Conference on Computer Vision (ICCV) 201
Zero-Shot Object Detection by Hybrid Region Embedding
Object detection is considered one of the most challenging problems in
computer vision, since it requires correct prediction of both classes and
locations of objects in images. In this study, we define a more difficult
scenario, namely zero-shot object detection (ZSD) where no visual training data
is available for some of the target object classes. We present a novel approach
to tackle this ZSD problem, where a convex combination of embeddings is used
in conjunction with a detection framework. For evaluation of ZSD methods, we
propose a simple dataset constructed from Fashion-MNIST images and also a
custom zero-shot split for the Pascal VOC detection challenge. The experimental
results suggest that our method yields promising results for ZSD.
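One common way to realize a "convex combination of embeddings" is to weight the word embeddings of the seen classes by the classifier's probabilities for a region, then pick the nearest unseen class embedding. The sketch below assumes that reading; the cosine similarity, the class names, and the vectors are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def convex_combination(probs: Dict[str, float],
                       embeddings: Dict[str, Sequence[float]]) -> List[float]:
    """Weight seen-class embeddings by normalized probabilities (weights sum to 1)."""
    total = sum(probs.values())
    dim = len(next(iter(embeddings.values())))
    return [
        sum(probs[c] / total * embeddings[c][d] for c in probs)
        for d in range(dim)
    ]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def classify_unseen(probs: Dict[str, float],
                    seen_emb: Dict[str, Sequence[float]],
                    unseen_emb: Dict[str, Sequence[float]]) -> str:
    """Embed a region via its seen-class scores, then match to an unseen class."""
    z = convex_combination(probs, seen_emb)
    return max(unseen_emb, key=lambda c: cosine(z, unseen_emb[c]))


# Hypothetical embeddings and region scores.
seen_emb = {"cat": [1.0, 0.0], "car": [0.0, 1.0]}
unseen_emb = {"dog": [0.9, 0.1], "truck": [0.1, 0.9]}
region_probs = {"cat": 0.8, "car": 0.2}

print(classify_unseen(region_probs, seen_emb, unseen_emb))  # prints "dog"
```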
Robust and Reliable Stochastic Resource Allocation via Tail Waterfilling
Stochastic allocation of resources in the context of wireless systems
ultimately demands reactive decision making for meaningfully optimizing
network-wide random utilities, while respecting certain resource constraints.
Standard ergodic-optimal policies are however susceptible to the statistical
variability of fading, often leading to systems which are severely unreliable
and spectrally wasteful. On the flip side, minimax/outage-optimal policies are
too pessimistic and often hard to determine. We propose a new risk-aware
formulation of the resource allocation problem for standard multi-user
point-to-point power-constrained communication with no cross-interference, by
employing the Conditional Value-at-Risk (CV@R) as a measure of fading risk. A
remarkable feature of this approach is that it is a convex generalization of
the ergodic setting while inducing robustness and reliability in a fully
tunable way, thus bridging the gap between the (naive) ergodic and
(conservative) minimax approaches. We provide a closed-form expression for the
CV@R-optimal policy given primal/dual variables, extending the classical
stochastic waterfilling policy. We then develop a primal-dual tail-waterfilling
scheme to recursively learn a globally optimal risk-aware policy. The
effectiveness of the approach is verified via detailed simulations.
Comment: 5 pages, 7 figure
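To make the risk measure concrete, the following sketch computes an empirical CV@R through the standard variational form CV@R_alpha(Z) = min_t { t + E[(Z - t)_+] / alpha }. Treating Z as a loss (larger is worse) and alpha in (0, 1] as the tail level is an assumption of this sketch; the paper's sign convention may differ, and this is not the tail-waterfilling scheme itself.

```python
from typing import Sequence


def cvar(losses: Sequence[float], alpha: float) -> float:
    """Empirical CV@R at tail level alpha, with losses where larger is worse."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(losses)

    def objective(t: float) -> float:
        return t + sum(max(z - t, 0.0) for z in losses) / (alpha * n)

    # The objective is convex and piecewise linear in t with breakpoints at
    # the samples, so its minimum is attained at one of the sample values.
    return min(objective(t) for t in losses)


samples = [1.0, 2.0, 3.0, 4.0]
print(cvar(samples, 0.5))  # prints 3.5, the mean of the worst half
print(cvar(samples, 1.0))  # prints 2.5, the plain expectation
```

The two printed values show the tunability mentioned in the abstract: alpha = 1 recovers the ergodic (expected-value) criterion, while smaller alpha focuses on an increasingly narrow tail and approaches the worst case.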
Cross-task weakly supervised learning from instructional videos
In this paper we investigate learning visual models for the steps of ordinary
tasks using weak supervision via instructional narrations and an ordered list
of steps instead of strong supervision via temporal annotations. At the heart
of our approach is the observation that weakly supervised learning may be
easier if a model shares components while learning different steps: 'pour egg'
should be trained jointly with other tasks involving 'pour' and 'egg'. We
formalize this in a component model for recognizing steps and a weakly
supervised learning framework that can learn this model under temporal
constraints from narration and the list of steps. Past data does not permit
systematic studying of sharing and so we also gather a new dataset, CrossTask,
aimed at assessing cross-task sharing. Our experiments demonstrate that sharing
across tasks improves performance, especially when done at the component level
and that our component model can parse previously unseen tasks by virtue of its
compositionality.
Comment: 18 pages, 17 figures, to be published in proceedings of the CVPR,
201
High-level efficient constraint dominance programming for pattern mining problems
Pattern mining is a sub-field of data mining that focuses on discovering patterns in data to extract knowledge. There are various techniques to identify different types of patterns in a dataset. Constraint-based mining is a well-known approach to this where additional constraints are introduced to retrieve only interesting patterns. However, in these systems, there are limitations on imposing complex constraints.
Constraint programming is a declarative methodology where the problem is modelled using constraints. Generic solvers can operate on a model to find the solutions. Constraint programming has been shown to be a well-suited and generic framework for various pattern mining problems with a selection of constraints and their combinations. However, a system that handles arbitrary constraints in a generic way has been missing in this field.
In this thesis, we propose a declarative framework where the pattern mining models can be represented in high-level constraint specifications with arbitrary additional constraints. These models can be efficiently solved using underlying optimisations.
The first contribution of this thesis is to determine the key aspects of solving pattern mining problems by creating an ad-hoc solver system. We investigate this further and create Constraint Dominance Programming (CDP) to capture certain behaviours of pattern mining problems in an abstract way. To that end, we integrate CDP into the high-level Essence pipeline. Early empirical evaluation shows that CDP is already competitive with existing techniques. The second contribution of this thesis is to exploit an additional behaviour, incomparability, in pattern mining problems. By adding the incomparability condition to CDP, we create CDP+I, a more explicit and even more efficient framework to represent these problems. We also prototype an automated system to deduce the optimal incomparability information for a given modelled problem. The third contribution of this thesis is to focus on the underlying solving of CDP+I to bring further efficiency. By creating the Solver Interactive Interface (SII) on SAT and SMT back-ends, we highly optimise not only CDP+I but any iterative modelling and solving, such as optimisation problems. The final contribution of this thesis is to investigate an automated configuration selection system that determines the best-performing solving methodologies for CDP+I, and to introduce a portfolio of configurations that can perform better than any single best solver.
In summary, this thesis presents a highly efficient, high-level declarative framework to tackle pattern mining problems.
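As a rough illustration of what a dominance relation between patterns can look like, the sketch below enumerates closed frequent itemsets by brute force: a frequent itemset is treated as dominated when a strict superset has the same support. Choosing closedness as the example, and all function names and data, are assumptions of this sketch; it does not use Essence, CDP, or any solver from the thesis.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def support(itemset: FrozenSet[str], transactions: List[Set[str]]) -> int:
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions: List[Set[str]],
                      min_support: int) -> Dict[FrozenSet[str], int]:
    """Brute-force enumeration; only suitable for tiny examples."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            sup = support(candidate, transactions)
            if sup >= min_support:
                result[candidate] = sup
    return result


def closed_itemsets(transactions: List[Set[str]],
                    min_support: int) -> Dict[FrozenSet[str], int]:
    """Keep the frequent itemsets not dominated by a superset of equal support."""
    freq = frequent_itemsets(transactions, min_support)
    return {
        p: s for p, s in freq.items()
        if not any(p < q and s == sq for q, sq in freq.items())
    }


# Hypothetical transaction database.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
closed = closed_itemsets(db, min_support=2)
print(sorted((sorted(p), s) for p, s in closed.items()))
# prints [(['a'], 4), (['a', 'b'], 2), (['a', 'c'], 2)]
```

Here {b} and {c} are dropped because {a, b} and {a, c} occur in exactly the same transactions, which is the kind of solution-against-solution condition that ordinary per-pattern constraints cannot express.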