3 research outputs found

    Disjunctive normal shape Boltzmann machine

    Shape Boltzmann machine (a type of Deep Boltzmann machine) is a powerful tool for shape modelling; however, it has some drawbacks in representing local shape parts. Disjunctive Normal Shape Model (DNSM) is a strong shape model that can effectively represent local parts of objects. In this paper, we propose a new shape model based on Shape Boltzmann Machine and Disjunctive Normal Shape Model, which we call Disjunctive Normal Shape Boltzmann Machine (DNSBM). DNSBM learns binary distributions of shapes by taking both local and global shape constraints into account using a type of Deep Boltzmann Machine. The samples generated using DNSBM look realistic. Moreover, DNSBM is capable of generating novel samples that differ from training examples by exploiting the local shape representation capability of DNSM. We demonstrate the performance of DNSBM for shape completion on two different data sets in which exploitation of local shape parts is important for capturing the statistical variability of the underlying shape distributions. Experimental results show that DNSBM is a strong model for representing shapes that are composed of local parts.

    Bayesian methods for segmentation of objects from multimodal and complex shape densities using statistical shape priors

    In many image segmentation problems involving limited and low-quality data, employing statistical prior information about the shapes of the objects to be segmented can significantly improve the segmentation result. However, defining probability densities in the space of shapes is an open and challenging problem, especially if the object to be segmented comes from a shape density involving multiple modes (classes). In the literature, there are some techniques that exploit nonparametric shape priors to learn multimodal prior densities from a training set. These methods solve, to some extent, the problem of segmenting objects from limited and low-quality data by performing maximum a posteriori (MAP) estimation. However, these methods assume that the boundaries found by using the observed data can provide at least a good initialization for MAP estimation so that convergence to a desired mode of the posterior density is achieved. There are two major problems with this assumption that we focus on in this thesis. First, as the data provide less information, these approaches can get stuck at a local optimum which may not be the desired solution. Second, even though a good initialization directs the segmenting curve to a local optimum solution that looks like the desired segmentation, it does not provide a picture of other probable solutions, potentially from different modes of the posterior density, based on the data and the priors. In this thesis, we propose methods for segmentation of objects that come from multimodal posterior densities and suffer from severe noise, occlusion and missing data. The first framework that we propose represents the segmentation problem in terms of the joint posterior density of shapes and features. We incorporate the learned joint shape and feature prior distribution into a maximum a posteriori estimation framework for segmentation.
In our second proposed framework, we approach the segmentation problem from the approximate Bayesian inference perspective. We propose two different Markov chain Monte Carlo (MCMC) sampling based image segmentation approaches that generate samples from the posterior density. As a final contribution of this thesis, we propose a new shape model that learns binary shape distributions by exploiting local shape priors and the Boltzmann machine. Although the proposed generative shape model has not been used in the context of object segmentation in this thesis, it has great potential to be used for this purpose. The source code of the methods introduced in this thesis will be available at https://github.com/eerdil
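
The abstract's point that a single MAP estimate can miss other probable solutions, while posterior samples can expose them, can be illustrated with a toy sketch. The code below runs a random-walk Metropolis-Hastings sampler on a one-dimensional bimodal density. The density, the step size, and all function names are assumptions chosen for illustration; this is not the thesis's segmentation method, which samples in the space of shapes.

```python
import math
import random


def log_posterior(x):
    """Toy bimodal log-density (assumed for illustration): an equal
    mixture of two unit-variance Gaussians centred at -3 and +3,
    up to an additive constant."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis_hastings(log_p, x0, n_samples, step=2.5, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = x0, log_p(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        samples.append(x)
    return samples


# Start in the left mode; a purely local optimizer started here would stay there.
samples = metropolis_hastings(log_posterior, x0=-3.0, n_samples=20000)
frac_right = sum(s > 0 for s in samples) / len(samples)
print(f"fraction of samples in the right-hand mode: {frac_right:.2f}")
```

Although the chain starts at the left mode, the samples cover both modes, which is the kind of picture of alternative solutions that a single local optimum does not give.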

    Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades

    The work presented in this dissertation is a step towards effectively incorporating contextual knowledge in the task of semantic segmentation. To date, the use of context has been confined to the genre of the scene, with few exceptions in the field. Research has been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached a near-human level of performance in relying on these descriptors when objects have stable distinctive surface properties and in proper imaging conditions. When these conditions are not met, humans exploit their knowledge about the intrinsic geometric layout of the scene to make local decisions. Computer vision lags behind when it comes to this asset. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades making use of scene topological aspects. We provide a classification scheme to carry out segmentation and recognition simultaneously. The algorithm is able to solve a single optimization function and yield a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We tackle the same problem of semantic facade segmentation with the neural network approach. We attain accuracy figures that are on par with the state-of-the-art in a fully automated pipeline. We start from pixelwise classifications obtained via Convolutional Neural Networks (CNN), which are then structurally validated through a cascade of Restricted Boltzmann Machines (RBM) and Multi-Layer Perceptron (MLP) that regenerates the most likely layout. In the domain of architectural modeling, we also consider geometric multi-model fitting. We introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MST), which surpasses other propagation techniques in terms of robustness to noise.
We make a number of additional contributions, such as a measure of model deviation which captures variations among fitted models.