78,527 research outputs found
Multi-texture image segmentation
Visual perception of images is closely related to the recognition of the different
texture areas within an image. Identifying the boundaries of these regions is an important
step in image analysis and image understanding. This thesis presents supervised and
unsupervised methods which allow an efficient segmentation of the texture regions within
multi-texture images.
The features used by the methods are based on a measure of the fractal dimension
of surfaces in several directions, which allows the transformation of the image into a set
of feature images, however no direct measurement of the fractal dimension is made. Using
this set of features, supervised and unsupervised, statistical processing schemes are
presented which produce low classification error rates. Natural texture images are
examined with particular application to the analysis of sonar images of the seabed.
A number of processes based on fractal models for texture synthesis are also
presented. These are used to produce realistic images of natural textures, again with
particular reference to sonar images of the seabed, and which show the importance of
phase and directionality in our perception of texture. A further extension is shown to give
possible uses for image coding and object identification
Perception Driven Texture Generation
This paper investigates a novel task of generating texture images from
perceptual descriptions. Previous work on texture generation focused on either
synthesis from examples or generation from procedural models. Generating
textures from perceptual attributes have not been well studied yet. Meanwhile,
perceptual attributes, such as directionality, regularity and roughness are
important factors for human observers to describe a texture. In this paper, we
propose a joint deep network model that combines adversarial training and
perceptual feature regression for texture generation, while only random noise
and user-defined perceptual attributes are required as input. In this model, a
preliminary trained convolutional neural network is essentially integrated with
the adversarial framework, which can drive the generated textures to possess
given perceptual attributes. An important aspect of the proposed model is that,
if we change one of the input perceptual features, the corresponding appearance
of the generated textures will also be changed. We design several experiments
to validate the effectiveness of the proposed method. The results show that the
proposed method can produce high quality texture images with desired perceptual
properties.Comment: 7 pages, 4 figures, icme201
Relating Objective and Subjective Performance Measures for AAM-based Visual Speech Synthesizers
We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that utilizes acoustic features as input, and one that utilizes a phonetic transcription as input. Both synthesizers are trained using the same data and the performance is measured using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect we find that the subjective score for the entire sequence is subjectively lower than sequences generated by our synthesizers. This observation motivates further consideration of an often ignored issue, which is to what extent are subjective measures correlated with objective measures of performance? Significantly, we find that the most commonly used objective measures of performance are not necessarily the best indicator of viewer perception of quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality
VITON: An Image-based Virtual Try-on Network
We present an image-based VIirtual Try-On Network (VITON) without using 3D
information in any form, which seamlessly transfers a desired clothing item
onto the corresponding region of a person using a coarse-to-fine strategy.
Conditioned upon a new clothing-agnostic yet descriptive person representation,
our framework first generates a coarse synthesized image with the target
clothing item overlaid on that same person in the same pose. We further enhance
the initial blurry clothing area with a refinement network. The network is
trained to learn how much detail to utilize from the target clothing item, and
where to apply to the person in order to synthesize a photo-realistic image in
which the target item deforms naturally with clear visual patterns. Experiments
on our newly collected Zalando dataset demonstrate its promise in the
image-based virtual try-on task over state-of-the-art generative models
Visual Object Networks: Image Generation with Disentangled 3D Representation
Recent progress in deep generative models has led to tremendous breakthroughs
in image generation. However, while existing models can synthesize
photorealistic images, they lack an understanding of our underlying 3D world.
We present a new generative model, Visual Object Networks (VON), synthesizing
natural images of objects with a disentangled 3D representation. Inspired by
classic graphics rendering pipelines, we unravel our image formation process
into three conditionally independent factors---shape, viewpoint, and
texture---and present an end-to-end adversarial learning framework that jointly
models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes
that are indistinguishable from real shapes. It then renders the object's 2.5D
sketches (i.e., silhouette and depth map) from its shape under a sampled
viewpoint. Finally, it learns to add realistic texture to these 2.5D sketches
to generate natural images. The VON not only generates images that are more
realistic than state-of-the-art 2D image synthesis methods, but also enables
many 3D operations such as changing the viewpoint of a generated image, editing
of shape and texture, linear interpolation in texture and shape space, and
transferring appearance across different objects and viewpoints.Comment: NeurIPS 2018. Code: https://github.com/junyanz/VON Website:
http://von.csail.mit.edu
- …