Affine Subspace Representation for Feature Description
This paper proposes a novel Affine Subspace Representation (ASR) descriptor
to deal with affine distortions induced by viewpoint changes. Unlike
traditional local descriptors such as SIFT, ASR inherently encodes local
information of multi-view patches, making it robust to affine distortions while
maintaining a high discriminative ability. To this end, PCA is used to
represent affine-warped patches as compact, efficient PCA-patch vectors.
Then, based on the subspace assumption, namely that the PCA-patch vectors of
the various affine-warped patches of the same keypoint can be represented by
a low-dimensional linear subspace, the ASR descriptor is obtained by applying
a simple subspace-to-point mapping. Such a linear subspace
representation could accurately capture the underlying information of a
keypoint (local structure) under multiple views without sacrificing its
distinctiveness. To accelerate the computation of the ASR descriptor, a fast
approximate algorithm is proposed that moves the most computationally
expensive part (i.e., warping patches under various affine transformations)
to an offline training stage.
Experimental results show that ASR not only outperforms state-of-the-art
descriptors under various image transformations, but also performs well
without a dedicated affine-invariant detector when dealing with viewpoint
changes.
Comment: To appear in the 2014 European Conference on Computer Vision
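The subspace-to-point mapping that produces such a descriptor can be sketched as follows. This is a minimal illustration under assumptions of my own (the function name, a fixed subspace dimension `k`, and a plain SVD for the basis), not the paper's implementation:

```python
import numpy as np

def subspace_to_point(patch_vectors, k=4):
    """Map PCA-patch vectors (one per affine warp of the same keypoint)
    to a single descriptor via a subspace-to-point mapping.

    patch_vectors: (n_warps, d) array; k: assumed subspace dimension.
    Returns the flattened orthogonal projection matrix onto the
    k-dimensional subspace spanned by the vectors.
    """
    X = np.asarray(patch_vectors, dtype=float).T   # columns are patch vectors
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Q = U[:, :k]                                   # orthonormal subspace basis
    P = Q @ Q.T                                    # projection onto the subspace
    return P.ravel()                               # the descriptor "point"
```

Because the projection matrix depends only on the subspace, any full-rank set of warps of the same keypoint maps to the same point, and Euclidean distance between such descriptors corresponds to a principal-angle-based distance between the underlying subspaces.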
Recovering facial shape using a statistical model of surface normal direction
In this paper, we show how a statistical model of facial shape can be embedded within a shape-from-shading algorithm. We describe how facial shape can be captured using a statistical model of variations in surface normal direction. To construct this model, we make use of the azimuthal equidistant projection to map the distribution of surface normals from the polar representation on a unit sphere to Cartesian points on a local tangent plane. The distribution of surface normal directions is captured using the covariance matrix for the projected point positions. The eigenvectors of the covariance matrix define the modes of shape-variation in the fields of transformed surface normals. We show how this model can be trained using surface normal data acquired from range images and how to fit the model to intensity images of faces using constraints on the surface normal direction provided by Lambert's law. We demonstrate that the combination of a global statistical constraint and local irradiance constraint yields an efficient and accurate approach to facial shape recovery and is capable of recovering fine local surface details. We assess the accuracy of the technique on a variety of images with ground truth, as well as on real-world images.
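The azimuthal equidistant projection used to linearise surface-normal directions can be sketched as below. For simplicity the tangent plane is anchored at the pole (0, 0, 1), whereas the paper's model works on local tangent planes; the function names are illustrative:

```python
import numpy as np

def aep_project(n):
    """Azimuthal equidistant projection of a unit surface normal onto the
    tangent plane at the pole (0, 0, 1): angular distance from the pole is
    preserved as Euclidean distance in the plane."""
    theta = np.arccos(np.clip(n[2], -1.0, 1.0))   # angle from the pole
    phi = np.arctan2(n[1], n[0])                  # azimuth
    return np.array([theta * np.cos(phi), theta * np.sin(phi)])

def aep_unproject(p):
    """Inverse mapping: tangent-plane point back to a unit normal."""
    theta = np.hypot(p[0], p[1])
    phi = np.arctan2(p[1], p[0])
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])
```

Statistics (mean, covariance, PCA) are then computed on the projected 2-D points, where linear operations are meaningful, and the results are mapped back to directions with the inverse projection.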
View Direction, Surface Orientation and Texture Orientation for Perception of Surface Shape
Textures are commonly used to enhance the representation of shape in non-photorealistic rendering applications such as medical drawings. Textures that have elongated linear elements appear to be superior to random textures in that they can, by the way they conform to the surface, reveal the surface shape. We observe that shape-following hachure marks, commonly used in cartography and copper-plate illustration, are locally similar to the effect of the lines that can be generated by the intersection of a set of parallel planes with a surface. We use this as a basis for investigating the relationships between view direction, texture orientation and surface orientation in affording surface shape perception. We report two experiments using parallel-plane textures. The results show that textures constructed from planes more nearly orthogonal to the line of sight tend to be better at revealing surface shape. Also, viewing surfaces from an oblique view is much better for revealing surface shape than viewing them from directly above.
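The parallel-plane texture construction described above can be sketched as a simple height-field shader; the function name and the `spacing` and `width` parameters are illustrative choices, not values from the paper:

```python
import numpy as np

def plane_texture(X, Y, Z, direction, spacing=0.2, width=0.02):
    """Binary 'contour' texture: dark where the surface point lies near one
    of a family of parallel planes orthogonal to `direction`, spaced
    `spacing` apart. direction: 3-vector (the planes' common normal)."""
    d = np.asarray(direction, float)
    d = d / np.linalg.norm(d)
    t = X * d[0] + Y * d[1] + Z * d[2]       # signed distance along d
    frac = np.mod(t, spacing)
    near = np.minimum(frac, spacing - frac)  # distance to nearest plane
    return (near < width).astype(float)      # 1 on a line, 0 elsewhere
```

Varying `direction` relative to the view direction is exactly the manipulation the experiments probe: planes nearly orthogonal to the line of sight yield the shape-revealing line patterns.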
Why the visual recognition system might encode the effects of illumination
A key problem in recognition is that the image of an object depends on the lighting conditions. We investigated whether recognition is sensitive to illumination using 3-D objects that were lit from either the left or right, varying both the shading and the cast shadows. In experiments 1 and 2 participants judged whether two sequentially presented objects were the same regardless of illumination. Experiment 1 used six objects that were easily discriminated and that were rendered with cast shadows. While no cost was found in sensitivity, there was a response-time cost over a change in lighting direction. Experiment 2 included six additional objects that were similar to the original six objects, making recognition more difficult. The objects were rendered with cast shadows, no shadows, and, as a control, white shadows. With normal shadows a change in lighting direction produced costs in both sensitivity and response times. With white shadows there was a much larger cost in sensitivity and a comparable cost in response times. Without cast shadows there was no cost in either measure, but the overall performance was poorer. Experiment 3 used a naming task in which names were assigned to six objects rendered with cast shadows. Participants practised identifying the objects in two viewpoints lit from a single lighting direction. Viewpoint and illumination invariance were then tested over new viewpoints and illuminations. Costs in both sensitivity and response time were found for naming the familiar objects in unfamiliar lighting directions regardless of whether the viewpoint was familiar or unfamiliar. Together these results suggest that illumination effects such as shadow edges: (1) affect visual memory; (2) serve the function of making the three-dimensional shape unambiguous.
A biologically inspired spiking model of visual processing for image feature detection
To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful, and hence edge or corner features, commonly called interest points, are often used for this purpose. Experimental research has illustrated that biological vision systems use neuronal circuits to extract particular features such as edges or corners from visual scenes. Inspired by this biological behaviour, this paper proposes a biologically inspired spiking neural network for the purpose of image feature extraction. Standard digital images are processed and converted to spikes in a manner similar to the processing that transforms light into spikes in the retina. Using a hierarchical spiking network, various types of biologically inspired receptive fields are used to extract progressively complex image features. The performance of the network is assessed by examining the repeatability of extracted features, with visual results presented using both synthetic and real images.
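Two common building blocks of such networks, intensity-to-latency spike encoding and a difference-of-Gaussians receptive field, can be sketched as follows. This is an illustrative sketch under my own assumptions (function names, parameter values), not the paper's architecture:

```python
import numpy as np

def intensity_to_latency(image, t_max=10.0):
    """Rank-order-style encoding: brighter pixels fire earlier. Intensities
    in [0, 1] map linearly to spike times in [0, t_max] (1.0 -> t = 0)."""
    img = np.clip(np.asarray(image, float), 0.0, 1.0)
    return t_max * (1.0 - img)

def dog_kernel(size=7, sigma_c=1.0, sigma_s=2.0):
    """Difference-of-Gaussians kernel, a standard model of centre-surround
    retinal receptive fields (on-centre when sigma_c < sigma_s)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = lambda s: np.exp(-(xx**2 + yy**2) / (2 * s**2)) / (2 * np.pi * s**2)
    return g(sigma_c) - g(sigma_s)
```

Convolving the image with the DoG kernel before latency encoding makes the earliest spikes mark centre-surround contrast, i.e. edge- and corner-like structure, which higher layers can then combine into more complex features.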
Evaluation of Object Detection Proposals Under Condition Variations
Object detection is a fundamental task in many computer vision applications,
therefore the importance of evaluating the quality of object detection is well
acknowledged in this domain. This process gives insight into the capabilities
of methods in handling environmental changes. In this paper, a new method for
generating object detection proposals is introduced that combines Selective
Search and EdgeBoxes. We tested the two base methods and their combination
under environmental variations. Our experiments demonstrate that the combined
method outperforms both baselines under illumination and viewpoint variations.
Comment: 2 pages, 6 figures, CVPR Workshop, 201
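Proposal quality under condition variations is conventionally scored as recall at an intersection-over-union threshold. A minimal sketch, where the function names and the 0.5 threshold are conventional choices rather than details from the paper:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def recall_at(proposals, ground_truth, thresh=0.5):
    """Fraction of ground-truth boxes covered by at least one proposal with
    IoU >= thresh -- the usual proposal-quality score."""
    hits = sum(any(iou(g, p) >= thresh for p in proposals)
               for g in ground_truth)
    return hits / len(ground_truth)
```

Comparing this recall curve across illumination or viewpoint conditions is what makes degradation under environmental change measurable.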
Hallucinating optimal high-dimensional subspaces
Linear subspace representations of appearance variation are pervasive in
computer vision. This paper addresses the problem of robustly matching such
subspaces (computing the similarity between them) when they are used to
describe the scope of variations within sets of images of different (possibly
greatly so) scales. A naive solution of projecting the low-scale subspace into
the high-scale image space is described first and subsequently shown to be
inadequate, especially at large scale discrepancies. A successful approach is
proposed instead. It consists of (i) an interpolated projection of the
low-scale subspace into the high-scale space, which is followed by (ii) a
rotation of this initial estimate within the bounds of the imposed
"downsampling constraint". The optimal rotation is found in closed form as
the one that best aligns the high-scale reconstruction of the low-scale
subspace with
the reference it is compared to. The method is evaluated on the problem of
matching sets of (i) face appearances under varying illumination and (ii)
object appearances under varying viewpoint, using two large data sets. In
comparison to the naive matching, the proposed algorithm is shown to greatly
increase the separation of between-class and within-class similarities, as well
as produce far more meaningful modes of common appearance on which the match
score is based.
Comment: Pattern Recognition, 201
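Subspace similarity of the kind this work builds on is conventionally measured through principal angles. A minimal sketch, where the mean-cosine match score is an illustrative choice rather than the paper's criterion:

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spaces of A and B: orthonormalise
    each basis, then the singular values of Qa^T Qb are the cosines of the
    principal angles."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def subspace_similarity(A, B):
    """A simple match score: mean cosine of the principal angles
    (1.0 for identical subspaces, 0.0 for orthogonal ones)."""
    return float(np.cos(principal_angles(A, B)).mean())
```

The failure mode the paper targets arises when A and B describe images of very different scales, so one subspace must first be mapped into the other's image space before any such comparison is meaningful.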
On the Design and Analysis of Multiple View Descriptors
We propose an extension of popular descriptors based on gradient orientation
histograms (HOG, computed in a single image) to multiple views. It hinges on
interpreting HOG as a conditional density in the space of sampled images, where
the effects of nuisance factors such as viewpoint and illumination are
marginalized. However, such marginalization is performed with respect to a very
coarse approximation of the underlying distribution. Our extension leverages
the fact that multiple views of the same scene allow separating intrinsic from
nuisance variability, and thus afford better marginalization of the latter. The
result is a descriptor that has the same complexity as single-view HOG, and can
be compared in the same manner, but exploits multiple views to better trade off
insensitivity to nuisance variability with specificity to intrinsic
variability. We also introduce a novel multi-view wide-baseline matching
dataset, consisting of a mixture of real and synthetic objects with
ground-truthed camera motion and dense three-dimensional geometry.
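The single-view building block being extended, a gradient-orientation histogram, can be sketched as follows. This is the generic construction, not the paper's multi-view descriptor; the bin count and normalisation are illustrative:

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Gradient-orientation histogram of an image patch, the building block
    of HOG-style descriptors: each pixel votes for its gradient direction,
    weighted by gradient magnitude; the result is normalised to sum to 1."""
    gy, gx = np.gradient(np.asarray(patch, float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)          # angles in [0, 2*pi)
    bins = np.minimum((ang / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Normalising the histogram is what licenses its interpretation as a (coarse) conditional density over sampled images, the view the paper's multi-view marginalisation refines.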
Variational Uncalibrated Photometric Stereo under General Lighting
Photometric stereo (PS) techniques nowadays remain constrained to an ideal
laboratory setup where modeling and calibration of lighting is amenable. To
eliminate such restrictions, we propose an efficient principled variational
approach to uncalibrated PS under general illumination. To this end, the
Lambertian reflectance model is approximated through a spherical harmonic
expansion, which preserves the spatial invariance of the lighting. The joint
recovery of shape, reflectance and illumination is then formulated as a single
variational problem. There the shape estimation is carried out directly in
terms of the underlying perspective depth map, thus implicitly ensuring
integrability and bypassing the need for a subsequent normal integration. To
tackle the resulting nonconvex problem numerically, we undertake a two-phase
procedure to initialize a balloon-like perspective depth map, followed by a
"lagged" block coordinate descent scheme. The experiments validate efficiency
and robustness of this approach. Across a variety of evaluations, we are able
to reduce the mean angular error consistently by a factor of 2-3 compared to
the state-of-the-art.
Comment: Haefner and Ye contributed equally
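The first-order spherical-harmonic approximation of Lambertian shading referred to above can be sketched as below; the variational method also uses higher-order expansions, and the function signature here is an assumption:

```python
import numpy as np

def sh_shading(normals, albedo, l):
    """First-order spherical-harmonic shading: for a Lambertian surface
    under general lighting, the irradiance at a point with unit normal n is
    approximately l[0] + l[1:] . n (four lighting coefficients).

    normals: (n, 3) unit normals; albedo: (n,); l: (4,) SH lighting vector.
    """
    n = np.asarray(normals, float)
    return np.asarray(albedo, float) * (l[0] + n @ np.asarray(l[1:], float))
```

In the uncalibrated setting, the coefficient vector `l` is unknown and is estimated jointly with depth and albedo; parameterising shape by a perspective depth map makes the normals a function of depth, which is what lets the whole problem be posed as a single variational optimisation.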