623 research outputs found
Web-Scale Training for Face Identification
Scaling machine learning methods to very large datasets has attracted
considerable attention in recent years, thanks to easy access to ubiquitous
sensing and data from the web. We study face recognition and show that three
distinct properties have surprising effects on the transferability of deep
convolutional networks (CNN): (1) The bottleneck of the network serves as an
important transfer learning regularizer, and (2) in contrast to the common
wisdom, performance saturation may exist in CNN's (as the number of training
samples grows); we propose a solution for alleviating this by replacing the
naive random subsampling of the training set with a bootstrapping process.
Moreover, (3) we find a link between the representation norm and the ability to
discriminate in a target domain, which sheds lights on how such networks
represent faces. Based on these discoveries, we are able to improve face
recognition accuracy on the widely used LFW benchmark, both in the verification
(1:1) and identification (1:N) protocols, and directly compare, for the first
time, with the state of the art Commercially-Off-The-Shelf system and show a
sizable leap in performance
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Leveraging Overhead Imagery for Localization, Mapping, and Understanding
Ground-level and overhead images provide complementary viewpoints of the world. This thesis proposes methods which leverage dense overhead imagery, in addition to sparsely distributed ground-level imagery, to advance traditional computer vision problems, such as ground-level image localization and fine-grained urban mapping. Our work focuses on three primary research areas: learning a joint feature representation between ground-level and overhead imagery to enable direct comparison for the task of image geolocalization, incorporating unlabeled overhead images by inferring labels from nearby ground-level images to improve image-driven mapping, and fusing ground-level imagery with overhead imagery to enhance understanding. The ultimate contribution of this thesis is a general framework for estimating geospatial functions, such as land cover or land use, which integrates visual evidence from both ground-level and overhead image viewpoints
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Inspired by the fact that human brains can emphasize discriminative parts of
the input and suppress irrelevant ones, substantial local mechanisms have been
designed to boost the development of computer vision. They can not only focus
on target parts to learn discriminative local representations, but also process
information selectively to improve the efficiency. In terms of application
scenarios and paradigms, local mechanisms have different characteristics. In
this survey, we provide a systematic review of local mechanisms for various
computer vision tasks and approaches, including fine-grained visual
recognition, person re-identification, few-/zero-shot learning, multi-modal
learning, self-supervised learning, Vision Transformers, and so on.
Categorization of local mechanisms in each field is summarized. Then,
advantages and disadvantages for every category are analyzed deeply, leaving
room for exploration. Finally, future research directions about local
mechanisms have also been discussed that may benefit future works. To the best
our knowledge, this is the first survey about local mechanisms on computer
vision. We hope that this survey can shed light on future research in the
computer vision field
- …