Stel Component Analysis: Joint Segmentation, Modeling and Recognition of Objects Classes
Models that capture the common structure of an object class appeared in the literature a few years ago (Jojic and Caspi in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 212-219, 2004; Winn and Jojic in Proceedings of International Conference on Computer Vision (ICCV), pp. 756-763, 2005); they are often referred to as "stel models." Their main characteristic is that they segment objects into clear, often semantic, parts as a consequence of the modeling constraint, which forces the regions belonging to a single segment to have a tight distribution over local measurements, such as color or texture. This self-similarity within a region of a single image is typical of many meaningful image parts, even when, across different images of similar objects, the corresponding parts may not have similar local measurements. Moreover, the segmentation itself is expected to be consistent within a class, although still flexible. These models have been applied mostly to segmentation scenarios. In this paper, we extend those ideas by (1) proposing to capture correlations that exist in structural elements of an image class due to global effects, (2) exploiting the segmentations to capture feature co-occurrences, and (3) allowing the use of multiple, possibly sparse, observations of different natures. In this way we obtain richer models that are more suitable for recognition tasks. We accomplish these requirements using a novel approach we dub stel component analysis. Experimental results show the flexibility of the model: it deals successfully with image/video segmentation and object recognition where, in particular, it can be used as an alternative to, or in conjunction with, bag-of-features and related classifiers, with stel inference providing a meaningful spatial partition of features.
Quantifying aesthetics of visual design applied to automatic design
In today's Instagram world, with advances in ubiquitous computing and access to social networks, digital media is being adopted by art and culture. In this dissertation, we study what makes a good design by investigating mechanisms that move the aesthetics of design from the realm of the subjective to the objective. These mechanisms combine three main approaches: learning theories and principles of design by collaborating with professional designers, mathematically and statistically modeling good designs from large-scale datasets, and crowdsourcing to model the perceived aesthetics of designs from general public responses. We then apply the knowledge gained to automatic design creation tools that help non-designers with self-publishing and designers with inspiration and creativity. Arguably, unlike the visual arts, where the main goals may be abstract, visual design is conceptualized and created to convey a message and communicate with audiences. Therefore, we develop a semantic design mining framework to automatically link the design elements (layout, color, typography, and photos) to linguistic concepts. The inferred semantics are applied to a design expert system that leverages user interactions to create personalized designs via recommendation algorithms based on the user's preferences.
Learning Material-Aware Local Descriptors for 3D Shapes
Material understanding is critical for design, geometric modeling, and
analysis of functional objects. We enable material-aware 3D shape analysis by
employing a projective convolutional neural network architecture to learn
material-aware descriptors from view-based representations of 3D points for
point-wise material classification or material-aware retrieval. Unfortunately,
only a small fraction of shapes in 3D repositories are labeled with physical
materials, posing a challenge for learning methods. To address this
challenge, we crowdsource a dataset of 3080 3D shapes with part-wise material
labels. We focus on furniture models which exhibit interesting structure and
material variability. In addition, we also contribute a high-quality
expert-labeled benchmark of 115 shapes from Herman-Miller and IKEA for evaluation. We
further apply a mesh-aware conditional random field, which incorporates
rotational and reflective symmetries, to smooth our local material
predictions across neighboring surface patches. We demonstrate the effectiveness of
our learned descriptors for automatic texturing, material-aware retrieval, and
physical simulation. The dataset and code will be publicly available. Comment: 3DV 201
Synthesizing and Editing Photo-realistic Visual Objects
In this thesis we investigate novel methods of synthesizing new images of a deformable visual object using a collection of images of the object. We investigate both parametric and non-parametric methods, as well as a combination of the two, for the problem of image synthesis. Our main focus is on complex visual objects, specifically deformable objects and objects with varying numbers of visible parts. We first introduce a sketch-driven image synthesis system, which allows the user to draw ellipses and outlines to sketch the rough shape of an animal as a constraint on the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions for the user's partial sketch. The user's sketch guides the non-parametric synthesis algorithm, which blends patches from two exemplar images in a coarse-to-fine fashion to create a final image. We evaluate the method and the synthesized images through two user studies. Compared with non-parametric blending of patches, a parametric model of the appearance is more desirable because its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis (C-CCA), a probabilistic generative parametric model that describes images with a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from the local shape information. We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer and reconstruction tasks. Drawing samples from C-CCA generates novel, globally coherent images, which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We develop a non-parametric model that enhances the samples of C-CCA with locally coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends the patches together.
We analyze the results of the combined method on datasets of horse and elephant images.
Medical image synthesis using generative adversarial networks: towards photo-realistic image synthesis
This work addresses photo-realism in synthetic images. We introduce a modified generative adversarial network, StencilGAN: a perceptually-aware generative adversarial network that synthesizes images based on overlaid labelled masks. This technique could help address the scarcity of resources in the healthcare sector.
Music as complex emergent behaviour : an approach to interactive music systems
This thesis suggests a new model of human-machine interaction in the domain of non-idiomatic
musical improvisation. Musical results are viewed as emergent phenomena
issuing from complex internal systems behaviour in relation to input from a single
human performer. We investigate the prospect of rewarding interaction whereby a
system modifies itself in coherent though non-trivial ways as a result of exposure to a
human interactor. In addition, we explore whether such interactions can be sustained
over extended time spans. These objectives translate into four criteria for evaluation:
maximisation of human influence, blending of human and machine influence in the
creation of machine responses, the maintenance of independent machine motivations
in order to support machine autonomy and finally, a combination of global emergent
behaviour and variable behaviour in the long run. Our implementation is heavily
inspired by ideas and engineering approaches from the discipline of Artificial Life.
However, we also address a collection of representative existing systems from the
field of interactive composing, some of which are implemented using techniques of
conventional Artificial Intelligence. All systems serve as a contextual background and
comparative framework helping the assessment of the work reported here.
This thesis advocates a networked model incorporating functionality for listening,
playing and the synthesis of machine motivations. The latter incorporate dynamic
relationships instructing the machine to either integrate with a musical context
suggested by the human performer or, in contrast, perform as an individual musical
character irrespective of context. Techniques of evolutionary computing are used to
optimise system components over time. Evolution proceeds based on an implicit
fitness measure: the melodic distance between consecutive musical statements made
by human and machine in relation to the currently prevailing machine motivation.
A substantial number of systematic experiments reveal complex emergent behaviour
inside and between the various system modules. Music scores document how global
system behaviour is rendered into actual musical output. The concluding chapter
offers evidence of how the research criteria were accomplished and proposes
recommendations for future research.
Quality Assessment and Variance Reduction in Monte Carlo Rendering Algorithms
Over the past few decades much work has focused on physically based rendering, which attempts to produce images that are indistinguishable from natural images such as photographs. Physically based rendering algorithms simulate the complex interactions of light with physically based material, light source, and camera models by formulating them as complex high-dimensional integrals [Kaj86] which do not have a closed-form solution. Stochastic processes such as Monte Carlo methods can be structured to approximate the expectation of these integrals, producing algorithms which converge to the true rendering solution in the limit as the amount of computation increases.
When a finite amount of computation is used to approximate the rendering solution, images will contain undesirable distortions in the form of noise from under-sampling in image regions with complex light interactions. An important aspect of developing algorithms in this domain is to have a means of accurately comparing and contrasting the relative performance gains between different approaches. Image Quality Assessment (IQA) measures provide a way of condensing the high dimensionality of image data to a single scalar value which can be used as a representative measure of image quality and fidelity. These measures are largely developed in the context of image datasets containing natural images (photographs) coupled with their synthetically distorted versions, and quality assessment scores given by human observers under controlled viewing conditions. Inference using these measures therefore relies on whether the synthetic distortions used to develop the IQA measures are representative of the natural distortions that will be seen in images from the domain being assessed.
When we consider images generated through stochastic rendering processes, the structure of visible distortions that are present in un-converged images is highly complex and spatially varying based on lighting and scene composition. In this domain the simple synthetic distortions commonly used to train and evaluate IQA measures are not representative of the complex natural distortions from the rendering process. This raises the question of how robust IQA measures are when applied to physically based rendered images.
In this thesis we summarize the classical and recent works in the area of physically based rendering using stochastic approaches such as Monte Carlo methods. We develop a modern C++ framework wrapping MPI for managing and running code on large-scale distributed computing environments. With this framework we use high performance computing to generate a dataset of Monte Carlo images. From this we provide a study on the effectiveness of modern and classical IQA measures and their robustness when evaluating images generated through stochastic rendering processes. Finally, we build on the strengths of these IQA measures and apply modern deep-learning methods to the No Reference IQA problem, where we wish to assess the quality of a rendered image without knowing its true value.
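As a rough illustration of the two ideas in this abstract (Monte Carlo estimates that converge as computation increases, and an IQA measure that condenses an image to one scalar), the following sketch may help. It is not the thesis's framework or dataset: the toy "image", the function names, and the choice of PSNR as a simple full-reference measure are all assumptions made for this example. Each toy pixel estimates an integral with a known closed form, so a reference image is available.

```python
import math
import random


def mc_estimate(f, n_samples, rng):
    """Monte Carlo estimate of the integral of f over [0, 1]."""
    total = 0.0
    for _ in range(n_samples):
        total += f(rng.random())
    return total / n_samples


def render(width, n_samples, rng):
    """Toy 'image' (hypothetical): pixel i estimates the integral of x**(i + 1)."""
    return [
        mc_estimate(lambda x, p=i + 1: x ** p, n_samples, rng)
        for i in range(width)
    ]


def reference(width):
    """Closed form: the integral of x**p over [0, 1] equals 1 / (p + 1)."""
    return [1.0 / (i + 2) for i in range(width)]


def mse(image, ref):
    """Mean squared error between an image and its reference."""
    return sum((a - b) ** 2 for a, b in zip(image, ref)) / len(ref)


def psnr(image, ref, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    err = mse(image, ref)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    rng = random.Random(0)
    ref = reference(64)
    # More samples per pixel should give less noise and a higher PSNR.
    for n in (4, 64, 1024):
        print(n, "samples per pixel, PSNR (dB):", psnr(render(64, n, rng), ref))
```

PSNR is used here only because it is simple and needs a reference image; the abstract's concern is precisely that measures like this may not reflect perceived quality for rendering noise, and the No Reference setting removes the reference altogether.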