Shape Interaction Matrix Revisited and Robustified: Efficient Subspace Clustering with Corrupted and Incomplete Data
The Shape Interaction Matrix (SIM) is one of the earliest approaches to
performing subspace clustering (i.e., separating points drawn from a union of
subspaces). In this paper, we revisit the SIM and reveal its connections to
several recent subspace clustering methods. Our analysis lets us derive a
simple, yet effective algorithm to robustify the SIM and make it applicable to
realistic scenarios where the data is corrupted by noise. We justify our method
with intuitive examples and matrix perturbation theory. We then show how this
approach can be extended to handle missing data, thus yielding an efficient and
general subspace clustering algorithm. We demonstrate the benefits of our
approach over state-of-the-art subspace clustering methods on several
challenging motion segmentation and face clustering problems, where the data
includes corrupted and missing measurements. Comment: This is an extended version of our ICCV15 paper
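
As a rough illustration of the classical, non-robust SIM that the abstract revisits, the sketch below builds the matrix from the top right singular vectors of the data matrix. It is a minimal toy under stated assumptions, not the robustified algorithm of the paper: the function name `shape_interaction_matrix`, the noise-free synthetic data, and the known rank are all choices made here for illustration.

```python
import numpy as np

def shape_interaction_matrix(X, rank):
    """Classical SIM sketch: |V_r V_r^T| for a D x N data matrix X.

    Columns of X are data points; `rank` is assumed known here.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v_r = vt[:rank].T              # N x rank, top right singular vectors
    return np.abs(v_r @ v_r.T)     # N x N affinity between points

# Toy data (assumption): two independent 2-D subspaces in R^6, no noise.
rng = np.random.default_rng(0)
basis_a = rng.standard_normal((6, 2))
basis_b = rng.standard_normal((6, 2))
X = np.hstack([
    basis_a @ rng.standard_normal((2, 5)),   # points 0..4 lie in subspace A
    basis_b @ rng.standard_normal((2, 5)),   # points 5..9 lie in subspace B
])

Q = shape_interaction_matrix(X, rank=4)
# In this idealized setting the cross-subspace entries vanish,
# so Q is block-diagonal up to floating-point error.
print(np.round(Q, 3))
```

With noise or missing entries the off-diagonal blocks no longer vanish, which is the situation the robustified method is meant to address.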
Multi-body Non-rigid Structure-from-Motion
Conventional structure-from-motion (SFM) research is primarily concerned with
the 3D reconstruction of a single, rigidly moving object seen by a static
camera, or a static and rigid scene observed by a moving camera; in both cases
there is only one relative rigid motion involved. Recent progress has
extended SFM to multi-body SFM (where there are multiple rigid
relative motions in the scene), as well as non-rigid SFM (where there is a
single non-rigid, deformable object or scene). Along this line of thinking,
there is an apparent gap, "multi-body non-rigid SFM", in which the
task would be to jointly reconstruct and segment multiple 3D structures of the
multiple, non-rigid objects or deformable scenes from images. Such a multi-body
non-rigid scenario is common in reality (e.g., two people shaking hands or a
multi-person social event), and solving it represents a natural
next step in SFM research. By leveraging recent results in subspace
clustering, this paper proposes, for the first time, an effective framework for
multi-body NRSFM, which simultaneously reconstructs the 3D trajectories and
segments each of them into its respective low-dimensional subspace. Under our
formulation, the 3D trajectories of each non-rigid structure can be well
approximated with a sparse affine combination of other 3D trajectories from the
same structure (self-expressiveness). We solve the resulting optimization problem with
the alternating direction method of multipliers (ADMM). We demonstrate the
efficacy of the proposed framework through extensive experiments on both
synthetic and real data sequences. Our method clearly outperforms
alternative methods, such as first clustering the 2D feature tracks into groups
and then performing non-rigid reconstruction in each group, or first conducting 3D
reconstruction under a single-subspace assumption and then clustering the 3D
trajectories into groups. Comment: 21 pages, 16 figures
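
To make the self-expressiveness idea concrete, the sketch below writes one point as an affine combination (coefficients summing to one) of other points from the same affine subspace. It is only an illustration: it uses a plain least-squares solve, so the coefficients are neither sparse nor obtained with ADMM as in the paper, and the helper name `affine_self_expression` and the toy data are assumptions made here.

```python
import numpy as np

def affine_self_expression(X, j):
    """Express column j of X as an affine combination of the other columns.

    Least-squares sketch only: no sparsity penalty, no ADMM.
    Returns the coefficients and the reconstruction residual.
    """
    others = np.delete(X, j, axis=1)
    # Appending a row of ones asks the coefficients to sum to one.
    A = np.vstack([others, np.ones((1, others.shape[1]))])
    b = np.append(X[:, j], 1.0)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = np.linalg.norm(others @ coeffs - X[:, j])
    return coeffs, residual

# Toy data (assumption): 6 points on a 2-D affine subspace of R^5.
rng = np.random.default_rng(1)
offset = rng.standard_normal((5, 1))
basis = rng.standard_normal((5, 2))
X = offset + basis @ rng.standard_normal((2, 6))

coeffs, residual = affine_self_expression(X, j=0)
print("coefficient sum:", coeffs.sum())
print("residual:", residual)
```

In a clustering setting, the magnitudes of such coefficients would serve as affinities between trajectories; a sparsity-promoting penalty is what encourages each point to select neighbours from its own structure.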
Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization
This paper tackles the problem of large-scale image-based localization (IBL)
where the spatial location of a query image is determined by finding the
most similar reference images in a large database. To solve this problem, a
critical task is to learn a discriminative image representation that captures
information relevant for localization. We propose a novel
representation learning method with higher location-discriminating power. It
makes the following contributions: 1) we represent a place (location) as a
set of exemplar images depicting the same landmarks and aim to maximize
similarities among intra-place images while minimizing similarities among
inter-place images; 2) we model a similarity measure as a probability
distribution on L_2-metric distances between intra-place and inter-place image
representations; 3) we propose a new Stochastic Attraction and Repulsion
Embedding (SARE) loss function minimizing the KL divergence between the learned
and the actual probability distributions; 4) we give theoretical comparisons
between the SARE, triplet ranking, and contrastive losses, which offer insight into
why SARE performs better by analyzing gradients. Our SARE loss is easy to implement
and can be plugged into any CNN. Experiments show that our proposed method improves
the localization performance on standard benchmarks by a large margin.
Demonstrating the broad applicability of our method, we obtained the third
place out of 209 teams in the 2018 Google Landmark Retrieval Challenge. Our
code and model are available at https://github.com/Liumouliu/deepIBL. Comment: ICC
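
The loss described in contributions 2) and 3) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation (their repository above is the authoritative source). It assumes a softmax over negative squared L2 distances as the learned matching distribution and a one-hot target on the positive image, in which case the KL divergence reduces to a negative log-probability. The function name `sare_style_loss` and the toy embeddings are hypothetical.

```python
import numpy as np

def sare_style_loss(query, positive, negatives):
    """KL-style attraction-repulsion loss sketch for one query.

    query: (d,), positive: (d,), negatives: (k, d) embeddings.
    Assumes a softmax over negative squared L2 distances and a
    one-hot target on the positive, so KL = -log p(positive).
    """
    d_pos = np.sum((query - positive) ** 2)
    d_neg = np.sum((query - negatives) ** 2, axis=1)
    logits = -np.concatenate([[d_pos], d_neg])
    logits = logits - logits.max()            # numerical stability
    log_prob_pos = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_pos

# Toy embeddings (assumption): 2-D vectors instead of CNN features.
query = np.array([0.0, 0.0])
negatives = np.array([[1.0, 0.0], [0.0, 1.0]])
loss_near = sare_style_loss(query, np.array([0.1, 0.0]), negatives)
loss_far = sare_style_loss(query, np.array([2.0, 0.0]), negatives)
print(loss_near, loss_far)
```

The loss shrinks as the positive moves closer to the query than the negatives (attraction) and grows when negatives are closer (repulsion), which is the qualitative behaviour the abstract describes.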