Sketch-a-Net that Beats Humans
We propose a multi-scale multi-channel deep neural network framework that,
for the first time, yields sketch recognition performance surpassing that of
humans. Our superior performance is a result of explicitly embedding the unique
characteristics of sketches in our model: (i) a network architecture designed
for sketch rather than natural photo statistics, (ii) a multi-channel
generalisation that encodes sequential ordering in the sketching process, and
(iii) a multi-scale network ensemble with joint Bayesian fusion that accounts
for the different levels of abstraction exhibited in free-hand sketches. We
show that state-of-the-art deep networks specifically engineered for photos of
natural objects fail to perform well on sketch recognition, regardless of
whether they are trained on photos or sketches. Our network, on the other
hand, not only
delivers the best performance on the largest human sketch dataset to date, but
is also small in size, making efficient training possible using just CPUs.
Comment: Accepted to BMVC 2015 (oral)
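The multi-channel idea above, encoding the sequential ordering of strokes as extra input channels, can be sketched as follows. The grouping of strokes into early/middle/late channels is an illustrative assumption, not the paper's exact scheme:

```python
import numpy as np

def strokes_to_channels(strokes, size=32, n_channels=3):
    """Rasterise an ordered list of strokes into n_channels binary images,
    grouping strokes by their position in the drawing sequence so a CNN
    sees temporal order (early / middle / late strokes)."""
    img = np.zeros((n_channels, size, size), dtype=np.float32)
    per_channel = max(1, -(-len(strokes) // n_channels))  # ceil division
    for i, stroke in enumerate(strokes):
        c = min(i // per_channel, n_channels - 1)
        for x, y in stroke:  # stroke = list of integer pixel coordinates
            img[c, y, x] = 1.0
    return img

# Toy sketch: three strokes drawn in order, one per channel.
strokes = [[(1, 1), (2, 2)], [(3, 3)], [(4, 4), (5, 5)]]
channels = strokes_to_channels(strokes)
print(channels.shape)      # (3, 32, 32)
print(channels[0, 2, 2])   # first stroke lands in channel 0 -> 1.0
```

A network fed this tensor can distinguish, say, an outline drawn first from details added last, which a single flattened raster image cannot convey.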
Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval
In this paper, we investigate the problem of zero-shot sketch-based image
retrieval (ZS-SBIR), where human sketches are used as queries to conduct
retrieval of photos from unseen categories. We importantly advance prior arts
by proposing a novel ZS-SBIR scenario that represents a firm step forward in
its practical application. The new setting uniquely recognizes two important
yet often neglected challenges of practical ZS-SBIR, (i) the large domain gap
between amateur sketch and photo, and (ii) the necessity for moving towards
large-scale retrieval. We first contribute to the community a novel ZS-SBIR
dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000
photos spanning across 110 categories. Highly abstract amateur human sketches
are purposefully sourced to maximize the domain gap, instead of ones included
in existing datasets that can often be semi-photorealistic. We then formulate a
ZS-SBIR framework to jointly model sketches and photos into a common embedding
space. A novel strategy to mine the mutual information among domains is
specifically engineered to alleviate the domain gap. External semantic
knowledge is further embedded to aid semantic transfer. We show that, rather
surprisingly, retrieval performance that significantly outperforms the
state of the art on existing datasets can already be achieved using a
reduced version of our model. We further demonstrate the superior performance
of our full model by comparing with a number of alternatives on the newly
proposed dataset. The new dataset, plus all training and testing code of our
model, will be publicly released to facilitate future research.
Comment: Oral paper in CVPR 201
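The core retrieval mechanism described above, embedding sketches and photos into a common space and ranking photos by proximity to the sketch query, can be illustrated minimally. The 2-D embeddings and cosine ranking below are illustrative stand-ins for the learned model:

```python
import numpy as np

def retrieve(sketch_emb, photo_embs, k=2):
    """Rank photos by cosine similarity to a sketch query, assuming both
    have already been mapped into the same embedding space."""
    q = sketch_emb / np.linalg.norm(sketch_emb)
    P = photo_embs / np.linalg.norm(photo_embs, axis=1, keepdims=True)
    sims = P @ q                     # cosine similarity per photo
    return np.argsort(-sims)[:k]     # indices of the top-k photos

# Toy embeddings: photo 0 points the same way as the sketch query.
photos = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.9, 0.1])
print(retrieve(query, photos))  # photo 0 ranks first, then photo 2
```

In the zero-shot setting, the embedding functions must generalise to categories never seen during training, which is why the abstract stresses semantic knowledge to aid transfer.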
Open Cross-Domain Visual Search
This paper addresses cross-domain visual search, where visual queries
retrieve category samples from a different domain. For example, we may want to
sketch an airplane and retrieve photographs of airplanes. Despite considerable
progress, the search occurs in a closed setting between two pre-defined
domains. In this paper, we make the step towards an open setting where multiple
visual domains are available. This notably translates into a search between any
pair of domains, from a combination of domains or within multiple domains. We
introduce a simple -- yet effective -- approach. We formulate the search as a
mapping from every visual domain to a common semantic space, where categories
are represented by hyperspherical prototypes. Open cross-domain visual search
is then performed by searching in the common semantic space, regardless of
which domains are used as source or target. Domains are combined in the common
space to search from or within multiple domains simultaneously. A separate
training of every domain-specific mapping function enables an efficient scaling
to any number of domains without affecting the search performance. We
empirically illustrate our capability to perform open cross-domain visual
search in three different scenarios. Our approach is competitive with respect
to existing closed settings, where we obtain state-of-the-art results on
several benchmarks for three sketch-based search tasks.
Comment: Accepted at Computer Vision and Image Understanding (CVIU)
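The hyperspherical-prototype idea above can be sketched as follows: each category is a fixed point on the unit hypersphere, and every domain-specific encoder maps its inputs near the right prototype, so search works between any pair of domains. The prototype placement here is illustrative; the paper optimises prototype positions for separation:

```python
import numpy as np

# Fixed class prototypes on the unit circle (3 classes, 2-D for clarity).
prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

def nearest_prototype(embedding):
    """Assign an embedded sample (from any visual domain) to the class
    whose prototype has the highest cosine similarity."""
    e = embedding / np.linalg.norm(embedding)
    return int(np.argmax(prototypes @ e))

# Embeddings from different domains are compared to the same prototypes,
# so no sketch-to-photo mapping is ever trained directly.
print(nearest_prototype(np.array([0.8, 0.2])))   # -> 0
print(nearest_prototype(np.array([-0.5, 0.1])))  # -> 2
```

Because each domain's mapping is trained independently against the shared prototypes, adding a new domain never requires retraining the others, which is what makes the approach scale.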
Multi-view pairwise relationship learning for sketch based 3D shape retrieval
© 2017 IEEE. Recent progress in sketch-based 3D shape retrieval creates a novel and user-friendly way to explore massive 3D shapes on the Internet. However, current methods on this topic rely on designing invariant features for both sketches and 3D shapes, or on complex matching strategies. Therefore, they suffer from problems such as arbitrary drawings and inconsistent viewpoints. To tackle this problem, we propose a probabilistic framework based on Multi-View Pairwise Relationship (MVPR) learning. Our framework includes multiple views of 3D shapes as the intermediate layer between sketches and 3D shapes, and transforms the original retrieval problem into the form of inferring pairwise relationships between sketches and views. We accomplish pairwise relationship inference with a novel MVPR net, which can automatically predict and merge the pairwise relationships between a sketch and multiple views, thus freeing us from exhaustively selecting the best view of 3D shapes. We also propose to learn robust features for sketches and views via fine-tuning pre-trained networks. Extensive experiments on a large dataset demonstrate that the proposed method can outperform state-of-the-art methods significantly.
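The merging step described above, combining pairwise sketch-view scores so no single best view must be hand-picked, can be sketched with a simple pooling rule. Max-pooling is a stand-in here; the MVPR net learns the merge:

```python
import numpy as np

def merge_view_scores(pair_scores):
    """Merge sketch-view pairwise relationship scores across all rendered
    views of one 3D shape. Max-pooling (an illustrative merge rule) says
    a shape matches the sketch if its best-matching view does, so no
    single best view needs to be selected in advance."""
    return float(np.max(pair_scores))

# One sketch scored against 4 rendered views of each of two shapes.
shape_a = merge_view_scores([0.1, 0.9, 0.3, 0.2])  # strong from one view
shape_b = merge_view_scores([0.4, 0.4, 0.5, 0.3])  # mediocre from all views
print(shape_a > shape_b)  # True: shape A wins via its best view
```

The point of learning the merge, rather than fixing it to max, is robustness: a drawn sketch may partially match several views, and a learned combination can exploit that.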
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
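The lookup use case described above, categorising a problem by its data and label attributes and mapping it to a candidate method family, can be sketched as a simple table. The attribute names and the mapping below are illustrative, not the survey's exact taxonomy:

```python
# Hypothetical taxonomy: (source labels, target data, label space) -> method family.
taxonomy = {
    ("labelled_source", "unlabelled_target", "same_labels"): "unsupervised domain adaptation",
    ("labelled_source", "few_labelled_target", "same_labels"): "supervised domain adaptation",
    ("labelled_source", "no_target_data", "new_labels"): "zero-shot learning",
}

def lookup(source_labels, target_data, label_space):
    """Return the method family for a problem's attribute triple, or flag
    it as one of the scarcely studied configurations."""
    return taxonomy.get((source_labels, target_data, label_space),
                        "open problem / scarcely studied")

print(lookup("labelled_source", "no_target_data", "new_labels"))
# -> zero-shot learning
```

A practitioner would extend the key with the survey's full attribute set; the point is that a well-chosen attribute tuple makes the problem, and hence the relevant literature, directly addressable.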