27,723 research outputs found
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
On the Design and Analysis of Multiple View Descriptors
We propose an extension of popular descriptors based on gradient orientation
histograms (HOG, computed in a single image) to multiple views. It hinges on
interpreting HOG as a conditional density in the space of sampled images, where
the effects of nuisance factors such as viewpoint and illumination are
marginalized. However, such marginalization is performed with respect to a very
coarse approximation of the underlying distribution. Our extension leverages on
the fact that multiple views of the same scene allow separating intrinsic from
nuisance variability, and thus afford better marginalization of the latter. The
result is a descriptor that has the same complexity of single-view HOG, and can
be compared in the same manner, but exploits multiple views to better trade off
insensitivity to nuisance variability with specificity to intrinsic
variability. We also introduce a novel multi-view wide-baseline matching
dataset, consisting of a mixture of real and synthetic objects with ground
truthed camera motion and dense three-dimensional geometry
Higher-order Projected Power Iterations for Scalable Multi-Matching
The matching of multiple objects (e.g. shapes or images) is a fundamental
problem in vision and graphics. In order to robustly handle ambiguities, noise
and repetitive patterns in challenging real-world settings, it is essential to
take geometric consistency between points into account. Computationally, the
multi-matching problem is difficult. It can be phrased as simultaneously
solving multiple (NP-hard) quadratic assignment problems (QAPs) that are
coupled via cycle-consistency constraints. The main limitations of existing
multi-matching methods are that they either ignore geometric consistency and
thus have limited robustness, or they are restricted to small-scale problems
due to their (relatively) high computational cost. We address these
shortcomings by introducing a Higher-order Projected Power Iteration method,
which is (i) efficient and scales to tens of thousands of points, (ii)
straightforward to implement, (iii) able to incorporate geometric consistency,
(iv) guarantees cycle-consistent multi-matchings, and (iv) comes with
theoretical convergence guarantees. Experimentally we show that our approach is
superior to existing methods
Structured learning of metric ensembles with application to person re-identification
Matching individuals across non-overlapping camera networks, known as person
re-identification, is a fundamentally challenging problem due to the large
visual appearance changes caused by variations of viewpoints, lighting, and
occlusion. Approaches in literature can be categoried into two streams: The
first stream is to develop reliable features against realistic conditions by
combining several visual features in a pre-defined way; the second stream is to
learn a metric from training data to ensure strong inter-class differences and
intra-class similarities. However, seeking an optimal combination of visual
features which is generic yet adaptive to different benchmarks is a unsoved
problem, and metric learning models easily get over-fitted due to the scarcity
of training data in person re-identification. In this paper, we propose two
effective structured learning based approaches which explore the adaptive
effects of visual features in recognizing persons in different benchmark data
sets. Our framework is built on the basis of multiple low-level visual features
with an optimal ensemble of their metrics. We formulate two optimization
algorithms, CMCtriplet and CMCstruct, which directly optimize evaluation
measures commonly used in person re-identification, also known as the
Cumulative Matching Characteristic (CMC) curve.Comment: 16 pages. Extended version of "Learning to Rank in Person
Re-Identification With Metric Ensembles", at
http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Paisitkriangkrai_Learning_to_Rank_2015_CVPR_paper.html.
arXiv admin note: text overlap with arXiv:1503.0154
- …