23,699 research outputs found
Disconnected Skeleton: Shape at its Absolute Scale
We present a new skeletal representation along with a matching framework to
address the deformable shape recognition problem. The disconnectedness arises
as a result of excessive regularization that we use to describe a shape at an
attainably coarse scale. Our motivation is to rely on the stable properties of
the shape instead of inaccurately measured secondary details. The new
representation does not suffer from the common instability problems of
traditional connected skeletons, and the matching process gives quite
successful results on a diverse database of 2D shapes. An important difference
of our approach from the conventional use of the skeleton is that we replace
the local coordinate frame with a global Euclidean frame supported by
additional mechanisms to handle articulations and local boundary deformations.
As a result, we can produce descriptions that are sensitive to any combination
of changes in scale, position, orientation and articulation, as well as
invariant ones.Comment: The work excluding {\S}V and {\S}VI has first appeared in 2005 ICCV:
Aslan, C., Tari, S.: An Axis-Based Representation for Recognition. In
ICCV(2005) 1339- 1346.; Aslan, C., : Disconnected Skeletons for Shape
Recognition. Masters thesis, Department of Computer Engineering, Middle East
Technical University, May 200
Metric Learning for Generalizing Spatial Relations to New Objects
Human-centered environments are rich with a wide variety of spatial relations
between everyday objects. For autonomous robots to operate effectively in such
environments, they should be able to reason about these relations and
generalize them to objects with different shapes and sizes. For example, having
learned to place a toy inside a basket, a robot should be able to generalize
this concept using a spoon and a cup. This requires a robot to have the
flexibility to learn arbitrary relations in a lifelong manner, making it
challenging for an expert to pre-program it with sufficient knowledge to do so
beforehand. In this paper, we address the problem of learning spatial relations
by introducing a novel method from the perspective of distance metric learning.
Our approach enables a robot to reason about the similarity between pairwise
spatial relations, thereby enabling it to use its previous knowledge when
presented with a new relation to imitate. We show how this makes it possible to
learn arbitrary spatial relations from non-expert users using a small number of
examples and in an interactive manner. Our extensive evaluation with real-world
data demonstrates the effectiveness of our method in reasoning about a
continuous spectrum of spatial relations and generalizing them to new objects.Comment: Accepted at the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems. The new Freiburg Spatial Relations Dataset and a demo
video of our approach running on the PR-2 robot are available at our project
website: http://spatialrelations.cs.uni-freiburg.d
Linking Image and Text with 2-Way Nets
Linking two data sources is a basic building block in numerous computer
vision problems. Canonical Correlation Analysis (CCA) achieves this by
utilizing a linear optimizer in order to maximize the correlation between the
two views. Recent work makes use of non-linear models, including deep learning
techniques, that optimize the CCA loss in some feature space. In this paper, we
introduce a novel, bi-directional neural network architecture for the task of
matching vectors from two data sources. Our approach employs two tied neural
network channels that project the two views into a common, maximally correlated
space using the Euclidean loss. We show a direct link between the
correlation-based loss and Euclidean loss, enabling the use of Euclidean loss
for correlation maximization. To overcome common Euclidean regression
optimization problems, we modify well-known techniques to our problem,
including batch normalization and dropout. We show state of the art results on
a number of computer vision matching tasks including MNIST image matching and
sentence-image matching on the Flickr8k, Flickr30k and COCO datasets.Comment: 14 pages, 2 figures, 6 table
Fully Automatic Expression-Invariant Face Correspondence
We consider the problem of computing accurate point-to-point correspondences
among a set of human face scans with varying expressions. Our fully automatic
approach does not require any manually placed markers on the scan. Instead, the
approach learns the locations of a set of landmarks present in a database and
uses this knowledge to automatically predict the locations of these landmarks
on a newly available scan. The predicted landmarks are then used to compute
point-to-point correspondences between a template model and the newly available
scan. To accurately fit the expression of the template to the expression of the
scan, we use as template a blendshape model. Our algorithm was tested on a
database of human faces of different ethnic groups with strongly varying
expressions. Experimental results show that the obtained point-to-point
correspondence is both highly accurate and consistent for most of the tested 3D
face models
- …