Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies
In motion analysis and understanding it is important to be able to fit a
suitable model or structure to the temporal series of observed data, in order
to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is
available, such a structure has to be learned from the data in a bottom-up
fashion. In recent times, volumetric approaches, in which the motion is captured
from a number of cameras and a voxel-set representation of the body is built
from the camera views, have gained ground due to attractive features such as
inherent view-invariance and robustness to occlusions. Automatic, unsupervised
segmentation of moving bodies along entire sequences, in a temporally-coherent
and robust way, has the potential to provide a means of constructing a
bottom-up model of the moving body, and track motion cues that may be later
exploited for motion classification. Spectral methods such as locally linear
embedding (LLE) can be useful in this context, as they preserve "protrusions",
i.e., high-curvature regions of the 3D volume, of articulated shapes, while
improving their separation in a lower dimensional space, making them in this
way easier to cluster. In this paper we therefore propose a spectral approach
to unsupervised and temporally-coherent body-protrusion segmentation along time
sequences. Volumetric shapes are clustered in an embedding space, clusters are
propagated in time to ensure coherence, and merged or split to accommodate
changes in the body's topology. Experiments on both synthetic and real
sequences of dense voxel-set data support the ability of the
proposed method to cluster body-parts consistently over time in a totally
unsupervised fashion, its robustness to sampling density and shape quality, and
its potential for bottom-up model construction.
Comment: 31 pages, 26 figures
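The idea of propagating clusters in time can be illustrated with a small sketch. It is not the authors' method: it assumes plain k-means in place of their clustering, skips the LLE embedding step entirely, and does not merge or split clusters. The function names (`kmeans`, `segment_sequence`) and the toy data are hypothetical. Each frame is clustered starting from the centroids of the previous frame, so a given label tends to follow the same body part over time.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def kmeans(points, init_centroids, iterations=20):
    """Lloyd's k-means started from the given centroids (assumed stand-in)."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def segment_sequence(frames, first_centroids):
    """Cluster every frame, seeding each one with the previous frame's centroids."""
    centroids = first_centroids
    all_labels = []
    for points in frames:
        labels, centroids = kmeans(points, centroids)
        all_labels.append(labels)
    return all_labels

# Toy sequence: two point groups that drift towards each other.
frame0 = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
frame1 = [(1.0, 0.0), (1.0, 1.0), (9.0, 0.0), (9.0, 1.0)]
labels = segment_sequence([frame0, frame1], [(0.0, 0.5), (10.0, 0.5)])
```

Because the second frame starts from the first frame's centroids, the left group keeps label 0 and the right group keeps label 1 in both frames, rather than receiving arbitrary labels per frame.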
WarpNet: Weakly Supervised Matching for Single-view Reconstruction
We present an approach to matching images of objects in fine-grained datasets
without using part annotations, with an application to the challenging problem
of weakly supervised single-view reconstruction. This is in contrast to prior
works that require part annotations, since matching objects across class and
pose variations is challenging with appearance features alone. We overcome this
challenge through a novel deep learning architecture, WarpNet, that aligns an
object in one image with a different object in another. We exploit the
structure of the fine-grained dataset to create artificial data for training
this network in an unsupervised-discriminative learning approach. The output of
the network acts as a spatial prior that allows generalization at test time to
match real images across variations in appearance, viewpoint and articulation.
On the CUB-200-2011 dataset of bird categories, we improve the AP over an
appearance-only network by 13.6%. We further demonstrate that our WarpNet
matches, together with the structure of fine-grained datasets, allow
single-view reconstructions with quality comparable to using annotated point
correspondences.
Comment: to appear in IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 201
Component-wise modeling of articulated objects
We introduce a novel framework for modeling articulated objects based on the aspects of their components. By decomposing the object into components, we divide the problem into smaller modeling tasks. After obtaining 3D models for each component aspect by employing a shape deformation paradigm, we merge them together, forming the object components. The final model is obtained by assembling the components using an optimization scheme which fits the respective 3D models to the corresponding apparent contours in a reference pose. The results suggest that our approach can produce realistic 3D models of articulated objects in reasonable time.
Fine-grained sketch-based image retrieval by matching deformable part models
(c) 2014. The copyright of this document resides with its authors.
It may be distributed unchanged freely in print or electronic forms.
An important characteristic of sketches, compared with text, rests with their ability to intrinsically capture object appearance and structure. Nonetheless, akin to traditional text-based image retrieval, conventional sketch-based image retrieval (SBIR) principally focuses on retrieving images of the same category, neglecting the fine-grained characteristics of sketches. In this paper, we advocate the expressiveness of sketches and examine their efficacy under a novel fine-grained SBIR framework. In particular, we study how sketches enable fine-grained retrieval within object categories. Key to this problem is introducing a mid-level sketch representation that not only captures object pose, but also possesses the ability to traverse sketch and image domains. Specifically, we learn a deformable part-based model (DPM) as a mid-level representation to discover and encode the various poses in sketch and image domains independently, after which graph matching is performed on DPMs to establish pose correspondences across the two domains. We further propose an SBIR dataset that covers the unique aspects of fine-grained SBIR. Through in-depth experiments, we demonstrate the superior performance of our SBIR framework, and showcase its unique ability in fine-grained retrieval.
Sensitivity to speech rhythm explains individual differences in reading ability independently of phonological awareness
This study considered whether sensitivity to speech rhythm can predict concurrent variance in reading attainment after individual differences in age, vocabulary and phonological awareness have been controlled. Five- to six-year-old English-speaking children completed a battery of phonological processing assessments and reading assessments, along with a simple word stress manipulation task. The results showed that performance on the stress manipulation measure predicted a significant amount of variance in reading attainment after age, vocabulary, and phonological processing had been taken into account. These results suggest that stress sensitivity is an important yet neglected aspect of English-speaking children's phonological representations, which needs to be incorporated into theoretical accounts of reading development.
Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images
Analysis-by-synthesis has been a successful approach for many tasks in
computer vision, such as 6D pose estimation of an object in an RGB-D image,
which is the topic of this work. The idea is to compare the observation with
the output of a forward process, such as a rendered image of the object of
interest in a particular pose. Due to occlusion or complicated sensor noise, it
can be difficult to perform this comparison in a meaningful way. We propose an
approach that "learns to compare", while taking these difficulties into
account. This is done by describing the posterior density of a particular
object pose with a convolutional neural network (CNN) that compares an observed
and rendered image. The network is trained with the maximum likelihood
paradigm. We observe empirically that the CNN does not specialize to the
geometry or appearance of specific objects, and it can be used with objects of
vastly different shapes and appearances, and in different backgrounds. Compared
to the state of the art, we demonstrate a significant improvement on two different
datasets which include a total of eleven objects, cluttered background, and
heavy occlusion.
Comment: 16 pages, 8 figures
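The analysis-by-synthesis loop described in this abstract can be sketched in a few lines. This is a toy illustration under strong assumptions, not the paper's system: the "pose" is a single integer offset in a 1D signal rather than a 6D pose, `render` is a hypothetical forward process, and `compare` is a hand-written summed absolute error standing in for the learned CNN comparison.

```python
def render(pose, length=12):
    """Hypothetical forward process: draw a 3-sample object at offset `pose`."""
    image = [0.0] * length
    for i, value in enumerate((1.0, 2.0, 1.0)):
        image[pose + i] = value
    return image

def compare(observed, rendered):
    """Assumed stand-in for the learned comparison: summed absolute error."""
    return sum(abs(o - r) for o, r in zip(observed, rendered))

def estimate_pose(observed, candidates):
    """Return the candidate pose whose rendering best matches the observation."""
    return min(candidates, key=lambda pose: compare(observed, render(pose)))

# Observation: the object at offset 5 with one sample hidden by an occluder.
observed = render(5)
observed[7] = 0.0
best = estimate_pose(observed, range(0, 10))  # best == 5
```

Even with one sample occluded, the rendering at offset 5 has the lowest error in this toy case. The abstract's point is that a fixed error measure like this one breaks down under heavier occlusion and sensor noise, which motivates learning the comparison instead.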