
    Image Parsing with a Wide Range of Classes and Scene-Level Context

    This paper presents a nonparametric scene parsing approach that improves the overall accuracy, as well as the coverage of foreground classes, in scene images. We first improve the label likelihood estimates at superpixels by merging likelihood scores from different probabilistic classifiers. This boosts classification performance and enriches the representation of less-represented classes. Our second contribution is the incorporation of semantic context into the parsing process through global label costs. Our method does not rely on image retrieval sets but rather assigns a global likelihood estimate to each label, which is plugged into the overall energy function. We evaluate our system on two large-scale datasets, SIFTflow and LMSun. We achieve state-of-the-art performance on the SIFTflow dataset and near-record results on LMSun. Comment: Published at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
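The two ideas in this abstract can be sketched in a few lines. This is a hedged, minimal illustration, not the paper's actual pipeline: likelihood scores from several classifiers are merged by averaging, and a global per-label cost is added to a (here, unary-only) energy before taking the per-superpixel argmin. All function names and the toy data are assumptions.

```python
import numpy as np

def merge_likelihoods(scores_per_classifier):
    """Average per-class likelihood scores from several classifiers.

    scores_per_classifier: list of (n_superpixels, n_classes) arrays.
    """
    return np.mean(scores_per_classifier, axis=0)

def label_superpixels(merged_scores, global_label_cost):
    """Pick, per superpixel, the label minimizing the negative
    log-likelihood plus a global label cost (an illustrative
    stand-in for the paper's full energy function)."""
    energy = -np.log(merged_scores + 1e-12) + global_label_cost[None, :]
    return energy.argmin(axis=1)

# Toy example: 2 superpixels, 3 classes, two classifiers.
a = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
b = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]])
merged = merge_likelihoods([a, b])
labels = label_superpixels(merged, global_label_cost=np.zeros(3))
print(labels)  # per-superpixel label indices
```

With a zero global cost this reduces to picking the class with the highest merged likelihood; a nonzero cost penalizes labels the global context deems unlikely.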

    Sample and Filter: Nonparametric Scene Parsing via Efficient Filtering

    Scene parsing has attracted a lot of attention in computer vision. While parametric models have proven effective for this task, they cannot easily incorporate new training data. By contrast, nonparametric approaches, which bypass any learning phase and directly transfer labels from the training data to the query images, can readily exploit new labeled samples as they become available. Unfortunately, because of the computational cost of their label transfer procedures, state-of-the-art nonparametric methods typically filter out most training images, keeping only a few relevant ones to label the query. As such, these methods throw away many images that still contain valuable information and generally obtain an unbalanced set of labeled samples. In this paper, we introduce a nonparametric approach to scene parsing that follows a sample-and-filter strategy. More specifically, we propose to sample labeled superpixels according to an image similarity score, which allows us to obtain a balanced set of samples. We then formulate label transfer as an efficient filtering procedure, which lets us exploit more labeled samples than existing techniques. Our experiments evidence the benefits of our approach over state-of-the-art nonparametric methods on two benchmark datasets. Comment: Please refer to the CVPR 2016 version of this manuscript.
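The sample-and-filter strategy described above might be sketched as follows. This is an assumed simplification, not the paper's method: labeled superpixels are drawn with probability tied to an image-similarity score, capped per class so the sample stays balanced, and labels are then transferred to query superpixels by a 1-nearest-neighbour "filter" in feature space (the paper's actual filtering procedure is more efficient).

```python
import numpy as np

def sample_balanced(features, labels, similarity, per_class, rng):
    """Draw up to per_class superpixels per class, with probability
    proportional to each superpixel's image-similarity score."""
    keep = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        p = similarity[idx] / similarity[idx].sum()
        n = min(per_class, len(idx))
        keep.extend(rng.choice(idx, size=n, replace=False, p=p))
    keep = np.array(keep)
    return features[keep], labels[keep]

def transfer_labels(query_feats, samp_feats, samp_labels):
    """1-NN label transfer: each query superpixel takes the label of
    its nearest sampled superpixel in feature space."""
    d = ((query_feats[:, None, :] - samp_feats[None, :, :]) ** 2).sum(-1)
    return samp_labels[d.argmin(axis=1)]

rng = np.random.default_rng(0)
feats = np.array([[0.0], [0.1], [1.0], [1.1]])
labs = np.array([0, 0, 1, 1])
sim = np.ones(4)
sf, sl = sample_balanced(feats, labs, sim, per_class=1, rng=rng)
out = transfer_labels(np.array([[0.05], [1.05]]), sf, sl)
print(out)
```

The per-class cap is what produces the balanced sample set the abstract emphasizes; without it, frequent classes would dominate the transfer.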

    Development of novel superpixel segmentation algorithms for image parsing

    Superpixels are widely used in image segmentation and parsing problems. In scene labeling, the image is first divided into visually consistent small pieces by a superpixel algorithm; the superpixels are then parsed into different classes. In this project, segmentation and labeling are considered together from a global perspective, and novel approaches are proposed for the different steps of image parsing. In particular, several methods are developed for alternative segmentation, feature extraction, class-likelihood computation, and contextual modeling of superpixels. Initially, the effect of different segmentation methods and parameters on labeling accuracy is thoroughly tested. Superpixel feature selection and coding, and the modeling of class-label likelihood computation, are then investigated. Finally, a generalized contextual modeling framework is developed for the fusion of alternative segmentation results. The proposed methods are tested and optimized on several semantic image databases. In addition, in the final phase of the project, this work is adapted to the problem of land cover classification from satellite images. Simulation results show that substantial improvements in image labeling accuracy can be achieved by accurately combining complementary information from different segmentation methods. TÜBİTAK
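The fusion of alternative segmentation results can be illustrated with a minimal sketch. This is an assumption about the general idea, not the project's generalized contextual modeling framework: each alternative superpixel segmentation yields per-pixel class likelihood maps, and combining them (here, a plain average followed by an argmax) can exploit information complementary across segmentations.

```python
import numpy as np

def fuse_segmentations(likelihood_maps):
    """likelihood_maps: list of (n_pixels, n_classes) arrays, one per
    alternative segmentation. Returns fused per-pixel label indices."""
    fused = np.mean(likelihood_maps, axis=0)
    return fused.argmax(axis=1)

# Toy data: 3 pixels, 2 classes, likelihoods from a coarse and a fine
# segmentation. The fine segmentation resolves the coarse one's tie.
coarse = np.array([[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]])
fine = np.array([[0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
fused_labels = fuse_segmentations([coarse, fine])
print(fused_labels)
```

In the toy data, the second pixel flips from class 0 (coarse view alone) to class 1 once the fine segmentation's evidence is folded in, which is the kind of correction fusion is meant to provide.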

    Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades

    The work presented in this dissertation is a step towards effectively incorporating contextual knowledge in the task of semantic segmentation. To date, the use of context has been confined to the genre of the scene, with a few exceptions in the field. Research has been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached near-human performance in relying on these descriptors when objects have stable, distinctive surface properties and imaging conditions are proper. When these conditions are not met, humans exploit their knowledge of the intrinsic geometric layout of the scene to make local decisions; computer vision lags behind when it comes to this asset. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades that make use of scene topology. We provide a classification scheme that carries out segmentation and recognition simultaneously. The algorithm solves a single optimization function and yields a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We also tackle semantic facade segmentation with a neural network approach and attain accuracy figures on par with the state of the art in a fully automated pipeline: pixelwise classifications obtained via Convolutional Neural Networks (CNNs) are structurally validated through a cascade of Restricted Boltzmann Machines (RBMs) and a Multi-Layer Perceptron (MLP) that regenerates the most likely layout. In the domain of architectural modeling, we address geometric multi-model fitting. We introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MSTs), which surpasses other propagation techniques in robustness to noise.
We make a number of additional contributions, such as a measure of model deviation that captures variations among fitted models.
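The MST-guided sampling idea can be sketched under stated assumptions: this is not the dissertation's algorithm, only an illustration of why a spanning tree helps. A minimum spanning tree is built over the data points (here with a small Prim's implementation over the complete Euclidean graph), and minimal sample sets for model fitting are drawn along tree edges, since MST neighbours are more likely to lie on the same geometric model than uniformly random pairs.

```python
import numpy as np

def mst_edges(points):
    """Prim's algorithm over the complete Euclidean graph; returns the
    list of (u, v) index pairs forming the minimum spanning tree."""
    n = len(points)
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        for u in in_tree:
            for v in range(n):
                if v not in in_tree and (
                    best is None or d[u, v] < d[best[0], best[1]]
                ):
                    best = (u, v)
        edges.append(best)
        in_tree.append(best[1])
    return edges

def guided_pairs(points):
    """Minimal two-point samples for line fitting, one per MST edge."""
    return [(points[u], points[v]) for u, v in mst_edges(points)]

# Three near-collinear points and one distant outlier: most MST edges
# connect points that plausibly belong to the same line model.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
pairs = guided_pairs(pts)
print(len(pairs))  # n - 1 edges -> 3 candidate minimal samples
```

Compared with uniform random pairing, most tree edges here join the collinear points, so the candidate samples concentrate on the true model rather than on cross-model mixtures.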

    Hybrid machine learning approaches for scene understanding: From segmentation and recognition to image parsing

    We address the problem of semantic scene understanding through studies of object segmentation/recognition and of scene labeling methods. We propose new techniques for joint recognition, segmentation, and pose estimation of infrared (IR) targets. The problem is formulated in a probabilistic level set framework in which a shape-constrained generative model provides a multi-class, multi-view shape prior, with the shape model built on a couplet of view and identity manifolds (CVIM). A level set energy function is then iteratively optimized under the shape constraints provided by the CVIM. Since both the view and identity variables appear explicitly in the objective function, this approach naturally accomplishes recognition, segmentation, and pose estimation as joint products of the optimization process. For realistic target chips, we solve the resulting multi-modal optimization problem with a particle swarm optimization (PSO) algorithm and improve computational efficiency with a gradient-boosted PSO (GB-PSO). Evaluation was performed on the Military Sensing Information Analysis Center (SENSIAC) ATR database, and experimental results show that both PSO algorithms reduce the cost of shape matching during CVIM-based shape inference. In particular, GB-PSO outperforms other recent ATR algorithms that require intensive shape matching, either explicitly (with pre-segmentation) or implicitly (without pre-segmentation). On the other hand, in situations where target boundaries are not clearly observed and object shapes are not reliably detected, we explored sparse representation classification (SRC) methods for ATR applications and developed a fusion technique that combines traditional SRC with a group-constrained SRC algorithm regulated by a sparsity concentration index, improving classification accuracy on the Comanche dataset.
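For readers unfamiliar with the optimizer named above, here is a generic particle swarm optimization sketch minimizing a toy one-dimensional cost. This is standard textbook PSO, not the paper's GB-PSO or its shape-matching energy; all parameter values are illustrative defaults.

```python
import numpy as np

def pso(cost, lo, hi, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a vectorized 1-D cost function over [lo, hi] with
    inertia w and cognitive/social weights c1, c2."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, n_particles)          # particle positions
    v = np.zeros(n_particles)                     # particle velocities
    pbest, pbest_cost = x.copy(), cost(x)         # personal bests
    gbest = pbest[pbest_cost.argmin()]            # global best
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        c = cost(x)
        better = c < pbest_cost
        pbest[better], pbest_cost[better] = x[better], c[better]
        gbest = pbest[pbest_cost.argmin()]
    return gbest

best = pso(lambda x: (x - 3.0) ** 2, lo=-10, hi=10)
print(round(best, 2))  # close to 3.0
```

PSO needs no gradients, which is why it suits the multi-modal shape-matching energies described in the abstract; the gradient-boosted variant adds local gradient information to speed up convergence.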
    Moreover, we present a compact rare-class-oriented scene labeling framework (RCSL) with a global-scene-assisted rare class retrieval process, in which the retrieved subset is expanded by choosing scene-regulated rare class patches. A complementary rare-class-balanced CNN is learned to alleviate the imbalanced data distribution problem at lower cost. A superpixel-based re-segmentation is implemented to produce more perceptually meaningful object boundaries. Quantitative results demonstrate the promising performance of the proposed framework in both pixel and class accuracy for scene labeling on the SIFTflow dataset, especially for rare-class objects.
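Class-balanced sampling of the kind used to train a rare-class-balanced CNN can be sketched as follows. This is a generic, assumed illustration rather than the RCSL framework itself: rare classes are resampled with replacement until every class contributes the same number of patches to a training batch.

```python
import numpy as np

def balance_patches(labels, per_class, rng):
    """Return patch indices such that every class appears exactly
    per_class times (rare classes are oversampled with replacement)."""
    chosen = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        chosen.extend(rng.choice(idx, size=per_class, replace=True))
    return np.array(chosen)

rng = np.random.default_rng(1)
labels = np.array([0] * 98 + [1] * 2)   # class 1 is rare (2% of patches)
batch = balance_patches(labels, per_class=50, rng=rng)
counts = np.bincount(labels[batch])
print(counts)  # both classes now equally represented
```

Without such balancing, a CNN trained on this toy distribution would see the rare class in only 2% of updates; the resampled batch gives both classes equal weight in the loss.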