Learning Grammars for Architecture-Specific Facade Parsing
Parsing facade images requires an optimal grammar for a given class of buildings. Such a grammar is usually handcrafted by experts. In this paper, we present a novel framework to learn a compact grammar from a set of ground-truth images. To this end, parse trees of ground-truth annotated images are obtained by running existing inference algorithms with a simple, very general grammar. From these parse trees, repeated subtrees are sought and merged to share derivations and produce a grammar with fewer rules. Furthermore, unsupervised clustering is performed on these rules, so that rules corresponding to the same complex pattern are grouped together, leading to a rich, compact grammar. Experimental validation and comparison with state-of-the-art grammar-based methods on four different datasets show that the learned grammar yields much faster convergence while producing parsing results that are as accurate as, or more accurate than, those of handcrafted grammars and of grammars learned by other methods. In addition, we release a new dataset of facade images from Paris in the Art Deco style and demonstrate the general applicability and potential of the proposed framework.
Eigengalaxies: Galaxy Morphology as a Linear Image Space and its Applications
In this thesis I contextualise the history of morphology as underpinned by Hubble's scheme, discrete
in nature, and deeply connected to theories of galaxy formation history. I set out, in contrast, to describe
a purely empirical morphology, continuous in nature, in which surveys become image spaces and galaxies
become points, the meaning of which is sought by the quantifiable differences of their relative spatial
positions. I show how an image space can be robustly constructed and then build upon it to illustrate
important applications such as approximating surveys with small samples, detecting outliers, clustering,
similarity search and missing data prediction.
The thesis proceeds as follows. Section 1 briefly surveys the importance, genesis and recent history
of galaxy morphology. It also lays out the objectives of the thesis and information about the survey
data which I have used. Section 2 describes how galaxy images can be processed and projected to a
defensible low-dimensional space in a morphology-preserving way. Several analyses are then performed
to test the fidelity of the projection. It is also shown how the image space can be given a probabilistic
interpretation. Section 3 discusses methods for approximating surveys by reducing the number of objects
under consideration. The section starts by describing simple random sampling and its limitations. It then
shows how means and covariances can be used to summarise image spaces and how differences between
image spaces can be quantified using the Kullback-Leibler divergence. This concept is then used to apply
“leverage scores” sampling as a means to use information from the galaxy population to create a weighted
sampling scheme which preserves mean and covariance better than random sampling and therefore enables
much smaller representative samples. I also motivate and describe a cutting-edge “coresets” methodology
which I intend to more fully explore in future work. Section 4 demonstrates parsimonious applications of
the image space framework to common use cases such as clustering, similarity search and outlier detection.
It is a modified and abridged version of a paper to be published in MNRAS.
Finally, Section 5 draws summary conclusions and highlights important directions for the future.
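The sampling idea summarised in Section 3 of the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an image space is an array of points summarised by a Gaussian (mean and covariance), that leverage scores are the squared row norms of the left singular vectors of the centred data, and that sampled points are re-weighted by inverse sampling probability. The function names and the synthetic `survey` array are hypothetical and are not taken from the thesis.

```python
import numpy as np

def summarise(points, weights=None):
    """Summarise a point set by its (optionally weighted) mean and covariance."""
    mean = np.average(points, axis=0, weights=weights)
    cov = np.cov(points, rowvar=False, aweights=weights)
    return mean, cov

def gaussian_kl(mean_p, cov_p, mean_q, cov_q):
    """KL(P || Q) between two multivariate Gaussians P and Q."""
    dim = mean_p.shape[0]
    diff = mean_q - mean_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    mahalanobis = diff @ np.linalg.solve(cov_q, diff)
    return 0.5 * (trace_term + mahalanobis - dim + logdet_q - logdet_p)

def leverage_scores(points):
    """Squared row norms of the left singular vectors of the centred data."""
    centred = points - points.mean(axis=0)
    u, _, _ = np.linalg.svd(centred, full_matrices=False)
    return np.sum(u ** 2, axis=1)

def leverage_sample(points, size, rng):
    """Draw rows with probability proportional to leverage (with replacement)."""
    scores = leverage_scores(points)
    probs = scores / scores.sum()
    idx = rng.choice(len(points), size=size, replace=True, p=probs)
    # Inverse-probability weights compensate for the non-uniform sampling.
    return points[idx], 1.0 / probs[idx]

# Synthetic stand-in for a survey projected to a 5-dimensional image space.
rng = np.random.default_rng(0)
survey = rng.normal(size=(2000, 5)) @ rng.normal(size=(5, 5))
full_mean, full_cov = summarise(survey)

sample, weights = leverage_sample(survey, 200, rng)
kl_leverage = gaussian_kl(*summarise(sample, weights), full_mean, full_cov)

random_idx = rng.choice(len(survey), size=200, replace=False)
kl_random = gaussian_kl(*summarise(survey[random_idx]), full_mean, full_cov)

print(f"KL(leverage sample || survey) = {kl_leverage:.4f}")
print(f"KL(random sample   || survey) = {kl_random:.4f}")
```

A smaller divergence means the sample's mean and covariance are closer to those of the full survey. Which scheme does better on this synthetic data depends on the random draw, so the sketch only shows how the comparison could be set up.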
Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades
The work presented in this dissertation is a step towards effectively incorporating contextual knowledge in the task of semantic segmentation. To date, with a few exceptions, the use of context has been confined to the genre of the scene, and research has been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached a near-human level of performance in relying on these descriptors when objects have stable, distinctive surface properties and imaging conditions are good. When these conditions are not met, humans exploit their knowledge of the intrinsic geometric layout of the scene to make local decisions, and computer vision lags behind in this respect. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades that make use of the topological aspects of the scene. We provide a classification scheme that carries out segmentation and recognition simultaneously. The algorithm solves a single optimization function and yields a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We also tackle the same problem of semantic facade segmentation with a neural network approach, attaining accuracy figures on par with the state of the art in a fully automated pipeline. Pixelwise classifications are first obtained via Convolutional Neural Networks (CNN); these are then structurally validated through a cascade of Restricted Boltzmann Machines (RBM) and a Multi-Layer Perceptron (MLP) that regenerates the most likely layout. Architectural modeling also involves geometric multi-model fitting, for which we introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MST) that surpasses other propagation techniques in terms of robustness to noise.
We make a number of additional contributions, such as a measure of model deviation which captures variations among fitted models.