26,244 research outputs found

    "'Who are you?' - Learning person specific classifiers from video"

    Get PDF
    We investigate the problem of automatically labelling faces of characters in TV or movie material with their names, using only weak supervision from automaticallyaligned subtitle and script text. Our previous work (Everingham et al. [8]) demonstrated promising results on the task, but the coverage of the method (proportion of video labelled) and generalization was limited by a restriction to frontal faces and nearest neighbour classification. In this paper we build on that method, extending the coverage greatly by the detection and recognition of characters in profile views. In addition, we make the following contributions: (i) seamless tracking, integration and recognition of profile and frontal detections, and (ii) a character specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters. We report results on seven episodes of the TV series “Buffy the Vampire Slayer”, demonstrating significantly increased coverage and performance with respect to previous methods on this material

    Combining multiple resolutions into hierarchical representations for kernel-based image classification

    Get PDF
    Geographic object-based image analysis (GEOBIA) framework has gained increasing interest recently. Following this popular paradigm, we propose a novel multiscale classification approach operating on a hierarchical image representation built from two images at different resolutions. They capture the same scene with different sensors and are naturally fused together through the hierarchical representation, where coarser levels are built from a Low Spatial Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels are generated from a High Spatial Resolution (HSR) or Very High Spatial Resolution (VHSR) image. Such a representation allows one to benefit from the context information thanks to the coarser levels, and subregions spatial arrangement information thanks to the finer levels. Two dedicated structured kernels are then used to perform machine learning directly on the constructed hierarchical representation. This strategy overcomes the limits of conventional GEOBIA classification procedures that can handle only one or very few pre-selected scales. Experiments run on an urban classification task show that the proposed approach can highly improve the classification accuracy w.r.t. conventional approaches working on a single scale.Comment: International Conference on Geographic Object-Based Image Analysis (GEOBIA 2016), University of Twente in Enschede, The Netherland

    Compositional Model based Fisher Vector Coding for Image Classification

    Full text link
    Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to depict the generation process of local features. However, the representative power of the GMM could be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes and the number of prototypes is usually small in FVC. To handle this limitation, in this paper we break the convention which assumes that a local feature is drawn from one of few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as the linear combination of multiple key components and the combination weight is a latent random variable. In this way, we can greatly enhance the representative power of the generative model of FVC. To implement our idea, we designed two particular generative models with such a compositional mechanism.Comment: Fixed typos. 16 pages. Appearing in IEEE T. Pattern Analysis and Machine Intelligence (TPAMI
    • …
    corecore