6 research outputs found

    Vector quantizing feature space with a regular lattice

    Most recent class-level object recognition systems work with visual words, i.e., vector quantized local descriptors. In this paper we examine the feasibility of a data-independent approach to construct such a visual vocabulary, where the feature space is discretized using a regular lattice. Using hashing techniques, only non-empty bins are stored, and fine-grained grids become possible in spite of the high dimensionality of typical feature spaces. Based on this representation, we can explore the structure of the feature space, and obtain state-of-the-art pixelwise classification results. In the case of image classification, we introduce a class-specific feature selection step, which takes the spatial structure of SIFT-like descriptors into account. Results are reported on the Graz02 dataset.
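The lattice-plus-hashing idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest regular lattice (an axis-aligned cubic grid with a single hypothetical `cell_size`); the paper's actual lattice and hashing scheme may differ. A Python dict serves as the hash table, so only non-empty bins take up memory even though a 128-dimensional grid has an astronomically large number of cells.

```python
from collections import defaultdict

import numpy as np


def lattice_histogram(descriptors, cell_size):
    """Count descriptors per cell of a cubic lattice, storing non-empty cells only.

    Assumption: a cubic grid with one shared cell size; not the paper's exact lattice.
    """
    # Integer cell coordinates of every descriptor along every dimension.
    cells = np.floor(np.asarray(descriptors, dtype=float) / cell_size).astype(np.int64)
    counts = defaultdict(int)
    for cell in cells:
        # The coordinate tuple is the hash key, so empty cells are never materialised.
        counts[tuple(cell.tolist())] += 1
    return dict(counts)


# Synthetic stand-ins for 128-dimensional SIFT-like descriptors (made-up data).
rng = np.random.default_rng(0)
descriptors = rng.random((1000, 128))
hist = lattice_histogram(descriptors, cell_size=0.5)

# A tiny 2-D case that is easy to check by hand.
small = lattice_histogram([[0.1, 0.2], [0.3, 0.4], [1.5, 0.2]], cell_size=1.0)
```

With two cells per dimension the grid above has 2^128 possible bins, yet the dict never holds more entries than there are descriptors.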

    Statistical cues for domain specific image segmentation with performance analysis

    This paper investigates the use of colour and texture cues for segmentation of images within two specified domains. The first is the Sowerby dataset, which contains one hundred colour photographs of country roads in England that have been interactively segmented and classified into six classes – edge, vegetation, air, road, building, and other. The second domain is a set of thirty-five images, taken in San Francisco, which have been interactively segmented into similar classes. In each domain we learn the joint probability distributions of filter responses, based on colour and texture, for each class. These distributions are then used for classification. We restrict ourselves to a limited number of filters in order to ensure that the learnt filter responses do not overfit the training data (our region classes are chosen so as to ensure that there is enough data to avoid overfitting). We analyse performance on the two datasets by evaluating the false positive and false negative error rates for the classification. This shows that the learnt models achieve high accuracy in classifying individual pixels into those classes for which the filter responses are approximately spatially homogeneous (i.e. road, vegetation, and air but not edge and building). A more sensitive performance measure, the Chernoff information, is calculated in order to quantify how well the cues for edge and building perform. This demonstrates that statistical knowledge of the domain is a powerful tool for segmentation.
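For readers unfamiliar with the Chernoff information mentioned in this abstract, the sketch below computes it for two discrete distributions using the standard definition C(p, q) = -min over 0 <= lambda <= 1 of log sum_x p(x)^lambda q(x)^(1-lambda), in nats. The histograms are made-up values standing in for class-conditional filter responses, the inputs are assumed strictly positive, and the minimisation is a plain grid search rather than whatever procedure the authors used.

```python
import numpy as np


def chernoff_information(p, q, num_lambdas=1001):
    """Chernoff information (in nats) between two strictly positive histograms.

    Assumption: grid search over lambda; the abstract does not say how it was computed.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalise so that both inputs are probability distributions.
    p = p / p.sum()
    q = q / q.sum()
    lambdas = np.linspace(0.0, 1.0, num_lambdas)
    # Evaluate sum_x p(x)^lambda * q(x)^(1 - lambda) at every grid point.
    mix = np.array([np.sum(p**lam * q ** (1.0 - lam)) for lam in lambdas])
    return float(-np.log(mix.min()))


# Hypothetical filter-response histograms for two classes.
road = [0.70, 0.20, 0.05, 0.05]
edge = [0.25, 0.25, 0.25, 0.25]
separation = chernoff_information(road, edge)
same = chernoff_information(road, road)
```

A larger value means the two classes are easier to tell apart from the cue; identical distributions give a value of zero.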

    Context-driven Object Detection and Segmentation with Auxiliary Information

    One fundamental problem in computer vision and robotics is to localize objects of interest in an image. The task can be formulated either as an object detection problem, if the objects are described by a set of pose parameters, or as an object segmentation problem, if we recover the object boundary precisely. A key issue in object detection and segmentation concerns exploiting the spatial context, as local evidence is often insufficient to determine object pose in the presence of heavy occlusions or large object appearance variations. This thesis addresses the object detection and segmentation problem in such adverse conditions with auxiliary depth data provided by RGBD cameras. We focus on four main issues in context-aware object detection and segmentation: 1) what are the effective context representations? 2) how can we work with limited and imperfect depth data? 3) how can we design depth-aware features and integrate depth cues into conventional visual inference tasks? 4) how can we make use of unlabeled data to relax the labeling requirements for training data? We discuss three object detection and segmentation scenarios based on varying amounts of available auxiliary information. In the first case, depth data are available for model training but not for testing. We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments, in which we extend the Hough hypothesis space to include both the object's location and its visibility pattern. We design a new score function that accumulates votes for object detection and occlusion prediction. In addition, we explore the correlation between objects and their environment, building a depth-encoded object-context model based on RGBD data. In the second case, we address the problem of localizing glass objects with noisy and incomplete depth data. Our method integrates the intensity and depth information from a single viewpoint, and builds a Markov Random Field that predicts glass boundary and region jointly. In addition, we propose a nonparametric, data-driven label transfer scheme for local glass boundary estimation. A weighted voting scheme based on a joint feature manifold is adopted to integrate depth and appearance cues, and we learn a distance metric on the depth-encoded feature manifold. In the third case, we make use of unlabeled data to relax the annotation requirements for object detection and segmentation, and propose a novel data-dependent margin distribution learning criterion for boosting, which utilizes the intrinsic geometric structure of datasets. One key aspect of this method is that it can seamlessly incorporate unlabeled data by including a graph Laplacian regularizer. We demonstrate the performance of our models and compare with baseline methods on several real-world object detection and segmentation tasks, including indoor object detection, glass object segmentation and foreground segmentation in video.
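The graph Laplacian regularizer mentioned near the end of this abstract can be illustrated generically. The sketch below is not the thesis's boosting criterion: it only shows the standard penalty f^T L f, with L = D - W built from a dense Gaussian affinity (an assumed choice of graph, with a hypothetical bandwidth `sigma`). The penalty needs no labels, which is why unlabeled points can contribute to it.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal (assumed graph construction)."""
    X = np.asarray(X, dtype=float)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return W


def laplacian_penalty(W, f):
    """Smoothness penalty f^T L f, where L = D - W is the unnormalised graph Laplacian."""
    f = np.asarray(f, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return float(f @ L @ f)


# Two tight clusters of made-up points; no labels are needed to build W.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
W = gaussian_affinity(X)
smooth = laplacian_penalty(W, [1, 1, -1, -1])  # predictions agree within each cluster
rough = laplacian_penalty(W, [1, -1, 1, -1])   # predictions disagree within each cluster
```

Predictions that vary little between strongly connected points incur a small penalty, so adding this term to a training objective favours decision functions that follow the geometry of the data.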