Weakly Supervised Learning of Objects, Attributes and Their Associations
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-10605-2_31
Weakly Supervised Learning of Objects and Attributes.
PhD thesis. This thesis presents weakly supervised learning approaches to directly
exploit image-level tags (e.g. objects, attributes) for comprehensive
image understanding, including tasks such as object localisation, image
description, image retrieval, semantic segmentation, person re-identification
and person search. Unlike conventional approaches, which tackle the
weakly supervised problem by learning a discriminative model, a generative
Bayesian framework is proposed which provides better mechanisms
to resolve the ambiguity problem. The proposed model differs significantly
from existing approaches in that: (1) All foreground object
classes are modelled jointly in a single generative model that encodes the
co-existence of multiple objects, so that "explaining away" inference can resolve
ambiguity and lead to better learning. (2) Image backgrounds are shared
across classes to better learn varying surroundings and "push out" objects
of interest. (3) The Bayesian formulation enables the exploitation of various
types of prior knowledge to compensate for the limited supervision
offered by weakly labelled data, as well as Bayesian domain adaptation
for transfer learning.
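The "explaining away" inference mentioned in point (1) can be illustrated with a toy Bayesian network. This is a minimal hypothetical sketch, not the thesis model: two independent causes (standing in for two object classes) can each produce a piece of evidence (standing in for an image-level tag); observing the evidence raises belief in both causes, but additionally observing that one cause is present pushes the posterior of the other back toward its prior. All probabilities below are made up for illustration.

```python
# Toy "explaining away" demo: two independent binary causes A and B,
# one observed evidence E, with a noisy-OR likelihood. Hypothetical numbers.
from itertools import product

P_A, P_B = 0.3, 0.3  # prior probability that each cause is present

def p_e_given(a, b):
    # noisy-OR: either active cause can generate the evidence with prob 0.8
    return 1 - (1 - 0.8 * a) * (1 - 0.8 * b)

def posterior_b(observe_a=None):
    """P(B=1 | E=1 [, A=observe_a]) by exact enumeration over (A, B)."""
    num = den = 0.0
    for a, b in product([0, 1], repeat=2):
        if observe_a is not None and a != observe_a:
            continue  # condition on the observed value of A
        w = (P_A if a else 1 - P_A) * (P_B if b else 1 - P_B) * p_e_given(a, b)
        num += w * b
        den += w
    return num / den

print(posterior_b())             # P(B | E): raised well above the 0.3 prior
print(posterior_b(observe_a=1))  # P(B | E, A=1): pushed back toward the prior
```

In a joint model over all object classes, this is the mechanism that lets one confidently localised object reduce the ambiguity about which image regions the remaining tags explain.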
Detecting objects is the first and a critical step in the image understanding
paradigm. Unlike conventional fully supervised object detection
approaches, the proposed model aims to train an object detector
from weakly labelled data. A novel framework based on Bayesian latent
topic model is proposed to address the problem of localisation of objects
as bounding boxes in images and videos with image level object labels.
The inferred object locations can then be used as annotations to train a
classic object detector with conventional approaches.
However, objects cannot tell the whole story in an image. Beyond detecting
objects, a general visual model should be able to describe objects
and segment them at a pixel level. Another limitation of the initial model is
that it still requires an additional object detector. To remedy the above two
drawbacks, a novel weakly supervised non-parametric Bayesian model is
presented to model objects, attributes and their associations automatically
from weakly labelled images. Once learned, given a new image, the proposed
model can describe the image with the combination of objects and
attributes, as well as their locations and segmentation.
Finally, this thesis further tackles the weakly supervised learning problem
from a transfer learning perspective, by considering the fact that there
are always some fully labelled or weakly labelled data available in a related
domain, while only insufficient labelled data exist for training in the
target domain. A powerful semantic description is transferred from existing
fashion photography datasets to surveillance data to solve the person
re-identification problem.
Learning Visual Attributes
We present a probabilistic generative model of visual attributes, together with an efficient learning algorithm. Attributes are visual qualities of objects, such as "red", "striped", or "spotted". The model sees attributes as patterns of image segments, repeatedly sharing some characteristic properties. These can be any combination of appearance, shape, or the layout of segments within the pattern. Moreover, attributes with general appearance are taken into account, such as the pattern of alternation of any two colors which is characteristic for stripes. To enable learning from unsegmented training images, the model is learnt discriminatively, by optimizing a likelihood ratio. As demonstrated in the experimental evaluation, our model can learn in a weakly supervised setting and encompasses a broad range of attributes. We show that attributes can be learnt starting from a text query to Google image search, and can then be used to recognize the attribute and determine its spatial extent in novel real-world images.
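The likelihood-ratio idea in the abstract above can be sketched in miniature. This is a hypothetical stand-in, not the paper's model: a segment feature is labelled as showing the attribute when the log-likelihood ratio log p(x | attribute) - log p(x | background) exceeds a threshold. The 1-D Gaussian class-conditional models and their parameters are invented for illustration.

```python
# Minimal likelihood-ratio classifier sketch: attribute vs. background
# segments, each modelled by a hypothetical 1-D Gaussian.
import math

def gauss_logpdf(x, mu, sigma):
    # log density of N(mu, sigma^2) at x
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

# hypothetical class-conditional parameters (mean, std)
ATTR = (2.0, 0.5)  # segments showing the attribute
BG = (0.0, 1.0)    # background segments

def log_likelihood_ratio(x):
    return gauss_logpdf(x, *ATTR) - gauss_logpdf(x, *BG)

def is_attribute(x, threshold=0.0):
    return log_likelihood_ratio(x) > threshold

print(is_attribute(1.9))   # feature near the attribute mode -> True
print(is_attribute(-0.2))  # feature near the background mode -> False
```

Optimizing the ratio rather than a single likelihood is what makes the learning discriminative: the model only needs to separate attribute segments from background, not to explain every pixel.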
LOCL: Learning Object-Attribute Composition using Localization
This paper describes LOCL (Learning Object Attribute Composition using
Localization) that generalizes compositional zero-shot learning to objects in
cluttered and more realistic settings. The problem of unseen Object Attribute
(OA) associations has been well studied in the field, however, the performance
of existing methods is limited in challenging scenes. In this context, our key
contribution is a modular approach to localizing objects and attributes of
interest in a weakly supervised context that generalizes robustly to unseen
configurations. Localization coupled with a composition classifier
significantly outperforms state of the art (SOTA) methods, with an improvement
of about 12% on currently available challenging datasets. Further, the
modularity enables the localized feature extractor to be used with
existing OA compositional learning methods to improve their overall
performance.
Comment: 20 pages, 7 figures, 11 tables, Accepted in British Machine Vision
Conference 202
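The composition idea behind this line of work can be sketched as follows. This is a hypothetical illustration in the spirit of compositional zero-shot learning, not LOCL's implementation: independently learned object and attribute scores (here, made-up numbers standing in for localized classifier heads) are combined so that an object-attribute pair never seen jointly at training time can still be scored and ranked.

```python
# Hypothetical composition scoring: combine per-object and per-attribute
# scores so unseen object-attribute pairs can be ranked. All values invented.
OBJECT_SCORES = {"car": 0.9, "dog": 0.2}         # from a localized object head
ATTRIBUTE_SCORES = {"striped": 0.7, "red": 0.4}  # from a localized attribute head

def composition_score(obj, attr):
    # simple product combination of the two independently learned heads
    return OBJECT_SCORES[obj] * ATTRIBUTE_SCORES[attr]

# rank every composition, including pairs never observed together in training
ranked = sorted(
    ((o, a) for o in OBJECT_SCORES for a in ATTRIBUTE_SCORES),
    key=lambda pair: composition_score(*pair),
    reverse=True,
)
print(ranked[0])  # -> ('car', 'striped')
```

Because each head is trained (and, in LOCL's case, localized) separately, the combination generalizes to unseen configurations; cluttered scenes are where localizing the relevant region before scoring matters most.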