10,552 research outputs found

    Weakly Supervised Learning of Objects, Attributes and Their Associations

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-10605-2_31

    Weakly Supervised Learning of Objects and Attributes.

    PhD. This thesis presents weakly supervised learning approaches that directly exploit image-level tags (e.g. objects, attributes) for comprehensive image understanding, including tasks such as object localisation, image description, image retrieval, semantic segmentation, person re-identification and person search. Unlike conventional approaches, which tackle the weakly supervised problem by learning a discriminative model, a generative Bayesian framework is proposed that provides better mechanisms for resolving ambiguity. The proposed model differs significantly from existing approaches in that: (1) All foreground object classes are modelled jointly in a single generative model that encodes the co-existence of multiple objects, so that “explaining away” inference can resolve ambiguity and lead to better learning. (2) Image backgrounds are shared across classes to better learn varying surroundings and “push out” objects of interest. (3) The Bayesian formulation enables the exploitation of various types of prior knowledge to compensate for the limited supervision offered by weakly labelled data, as well as Bayesian domain adaptation for transfer learning. Detecting objects is the first and critical component of the image understanding paradigm. Unlike conventional fully supervised object detection approaches, the proposed model aims to train an object detector from weakly labelled data. A novel framework based on a Bayesian latent topic model is proposed to address the problem of localising objects as bounding boxes in images and videos with image-level object labels. The inferred object locations can then be used as annotations to train a classic object detector with conventional approaches. However, objects cannot tell the whole story in an image. Beyond detecting objects, a general visual model should be able to describe objects and segment them at a pixel level.
    Another limitation of the initial model is that it still requires an additional object detector. To remedy the above two drawbacks, a novel weakly supervised non-parametric Bayesian model is presented to model objects, attributes and their associations automatically from weakly labelled images. Once learned, given a new image, the proposed model can describe the image with a combination of objects and attributes, as well as their locations and segmentation. Finally, this thesis tackles the weakly supervised learning problem from a transfer learning perspective, considering that some fully labelled or weakly labelled data are always available in a related domain while only insufficient labelled data exist for training in the target domain. A powerful semantic description is transferred from existing fashion photography datasets to surveillance data to solve the person re-identification problem.

    Learning Visual Attributes

    We present a probabilistic generative model of visual attributes, together with an efficient learning algorithm. Attributes are visual qualities of objects, such as ‘red’, ‘striped’, or ‘spotted’. The model sees attributes as patterns of image segments, repeatedly sharing some characteristic properties. These can be any combination of appearance, shape, or the layout of segments within the pattern. Moreover, attributes with general appearance are taken into account, such as the pattern of alternation of any two colors which is characteristic for stripes. To enable learning from unsegmented training images, the model is learnt discriminatively, by optimizing a likelihood ratio. As demonstrated in the experimental evaluation, our model can learn in a weakly supervised setting and encompasses a broad range of attributes. We show that attributes can be learnt starting from a text query to Google image search, and can then be used to recognize the attribute and determine its spatial extent in novel real-world images.

    LOCL: Learning Object-Attribute Composition using Localization

    This paper describes LOCL (Learning Object Attribute Composition using Localization), which generalizes compositional zero-shot learning to objects in cluttered and more realistic settings. The problem of unseen Object Attribute (OA) associations has been well studied in the field; however, the performance of existing methods is limited in challenging scenes. In this context, our key contribution is a modular approach to localizing objects and attributes of interest in a weakly supervised context that generalizes robustly to unseen configurations. Localization coupled with a composition classifier significantly outperforms state-of-the-art (SOTA) methods, with an improvement of about 12% on currently available challenging datasets. Further, the modularity enables the localized feature extractor to be used with existing OA compositional learning methods to improve their overall performance. Comment: 20 pages, 7 figures, 11 tables, Accepted in British Machine Vision Conference 202