Sketch-a-Net that Beats Humans
We propose a multi-scale multi-channel deep neural network framework that,
for the first time, yields sketch recognition performance surpassing that of
humans. Our superior performance is a result of explicitly embedding the unique
characteristics of sketches in our model: (i) a network architecture designed
for sketch rather than natural photo statistics, (ii) a multi-channel
generalisation that encodes sequential ordering in the sketching process, and
(iii) a multi-scale network ensemble with joint Bayesian fusion that accounts
for the different levels of abstraction exhibited in free-hand sketches. We
show that state-of-the-art deep networks specifically engineered for photos of
natural objects fail to perform well on sketch recognition, regardless of
whether they are trained on photos or sketches. Our network, on the other
hand, not only
delivers the best performance on the largest human sketch dataset to date, but
is also small in size, making efficient training possible using just CPUs.
Comment: Accepted to BMVC 2015 (oral)
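The multi-channel idea above, encoding the sequential ordering of strokes as extra input channels, can be sketched as follows. The grouping of strokes into early/middle/late channels is an illustrative assumption, not the paper's exact scheme:

```python
import numpy as np

def strokes_to_channels(strokes, size=32, n_channels=3):
    """Rasterise an ordered list of strokes into n_channels binary images,
    grouping strokes by their position in the drawing sequence so a CNN
    sees temporal order (early / middle / late strokes)."""
    img = np.zeros((n_channels, size, size), dtype=np.float32)
    per_channel = max(1, -(-len(strokes) // n_channels))  # ceil division
    for i, stroke in enumerate(strokes):
        c = min(i // per_channel, n_channels - 1)
        for x, y in stroke:  # stroke = list of integer pixel coordinates
            img[c, y, x] = 1.0
    return img

# Toy sketch: three strokes drawn in order, one per channel.
strokes = [[(1, 1), (2, 2)], [(3, 3)], [(4, 4), (5, 5)]]
channels = strokes_to_channels(strokes)
print(channels.shape)      # (3, 32, 32)
print(channels[0, 2, 2])   # first stroke lands in channel 0 -> 1.0
```

A network fed this tensor can distinguish, say, an outline drawn first from details added last, which a single flattened raster image cannot convey.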
Doodle to Search: Practical Zero-Shot Sketch-based Image Retrieval
In this paper, we investigate the problem of zero-shot sketch-based image
retrieval (ZS-SBIR), where human sketches are used as queries to conduct
retrieval of photos from unseen categories. We importantly advance prior arts
by proposing a novel ZS-SBIR scenario that represents a firm step forward in
its practical application. The new setting uniquely recognizes two important
yet often neglected challenges of practical ZS-SBIR, (i) the large domain gap
between amateur sketch and photo, and (ii) the necessity for moving towards
large-scale retrieval. We first contribute to the community a novel ZS-SBIR
dataset, QuickDraw-Extended, that consists of 330,000 sketches and 204,000
photos spanning across 110 categories. Highly abstract amateur human sketches
are purposefully sourced to maximize the domain gap, instead of ones included
in existing datasets that can often be semi-photorealistic. We then formulate a
ZS-SBIR framework to jointly model sketches and photos into a common embedding
space. A novel strategy to mine the mutual information among domains is
specifically engineered to alleviate the domain gap. External semantic
knowledge is further embedded to aid semantic transfer. We show that, rather
surprisingly, retrieval performance that significantly outperforms the
state of the art on existing datasets can already be achieved using a
reduced version of our model. We further demonstrate the superior performance
of our full model by comparing with a number of alternatives on the newly
proposed dataset. The new dataset, plus all training and testing code of our
model, will be publicly released to facilitate future research.
Comment: Oral paper in CVPR 201
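The core retrieval mechanism described above, embedding sketches and photos into a common space and ranking photos by proximity to the sketch query, can be illustrated minimally. The 2-D embeddings and cosine ranking below are illustrative stand-ins for the learned model:

```python
import numpy as np

def retrieve(sketch_emb, photo_embs, k=2):
    """Rank photos by cosine similarity to a sketch query, assuming both
    have already been mapped into the same embedding space."""
    q = sketch_emb / np.linalg.norm(sketch_emb)
    P = photo_embs / np.linalg.norm(photo_embs, axis=1, keepdims=True)
    sims = P @ q                     # cosine similarity per photo
    return np.argsort(-sims)[:k]     # indices of the top-k photos

# Toy embeddings: photo 0 points the same way as the sketch query.
photos = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.9, 0.1])
print(retrieve(query, photos))  # photo 0 ranks first, then photo 2
```

In the zero-shot setting, the embedding functions must generalise to categories never seen during training, which is why the abstract stresses semantic knowledge to aid transfer.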
Open Cross-Domain Visual Search
This paper addresses cross-domain visual search, where visual queries
retrieve category samples from a different domain. For example, we may want to
sketch an airplane and retrieve photographs of airplanes. Despite considerable
progress, the search occurs in a closed setting between two pre-defined
domains. In this paper, we make the step towards an open setting where multiple
visual domains are available. This notably translates into a search between any
pair of domains, from a combination of domains or within multiple domains. We
introduce a simple -- yet effective -- approach. We formulate the search as a
mapping from every visual domain to a common semantic space, where categories
are represented by hyperspherical prototypes. Open cross-domain visual search
is then performed by searching in the common semantic space, regardless of
which domains are used as source or target. Domains are combined in the common
space to search from or within multiple domains simultaneously. A separate
training of every domain-specific mapping function enables an efficient scaling
to any number of domains without affecting the search performance. We
empirically illustrate our capability to perform open cross-domain visual
search in three different scenarios. Our approach is competitive with respect
to existing closed settings, where we obtain state-of-the-art results on
several benchmarks for three sketch-based search tasks.
Comment: Accepted at Computer Vision and Image Understanding (CVIU)
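The hyperspherical-prototype idea above can be sketched as follows: each category is a fixed point on the unit hypersphere, and every domain-specific encoder maps its inputs near the right prototype, so search works between any pair of domains. The prototype placement here is illustrative; the paper optimises prototype positions for separation:

```python
import numpy as np

# Fixed class prototypes on the unit circle (3 classes, 2-D for clarity).
prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

def nearest_prototype(embedding):
    """Assign an embedded sample (from any visual domain) to the class
    whose prototype has the highest cosine similarity."""
    e = embedding / np.linalg.norm(embedding)
    return int(np.argmax(prototypes @ e))

# Embeddings from different domains are compared to the same prototypes,
# so no sketch-to-photo mapping is ever trained directly.
print(nearest_prototype(np.array([0.8, 0.2])))   # -> 0
print(nearest_prototype(np.array([-0.5, 0.1])))  # -> 2
```

Because each domain's mapping is trained independently against the shared prototypes, adding a new domain never requires retraining the others, which is what makes the approach scale.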
Multi-view pairwise relationship learning for sketch based 3D shape retrieval
© 2017 IEEE. Recent progress in sketch-based 3D shape retrieval creates a novel and user-friendly way to explore massive 3D shapes on the Internet. However, current methods on this topic rely on designing invariant features for both sketches and 3D shapes, or on complex matching strategies. Therefore, they suffer from problems such as arbitrary drawings and inconsistent viewpoints. To tackle this problem, we propose a probabilistic framework based on Multi-View Pairwise Relationship (MVPR) learning. Our framework includes multiple views of 3D shapes as the intermediate layer between sketches and 3D shapes, and transforms the original retrieval problem into the form of inferring pairwise relationships between sketches and views. We accomplish pairwise relationship inference with a novel MVPR net, which can automatically predict and merge the pairwise relationships between a sketch and multiple views, thus freeing us from exhaustively selecting the best view of 3D shapes. We also propose to learn robust features for sketches and views via fine-tuning pre-trained networks. Extensive experiments on a large dataset demonstrate that the proposed method can outperform state-of-the-art methods significantly.
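The merging step described above, combining pairwise sketch-view scores so no single best view must be hand-picked, can be sketched with a simple pooling rule. Max-pooling is a stand-in here; the MVPR net learns the merge:

```python
import numpy as np

def merge_view_scores(pair_scores):
    """Merge sketch-view pairwise relationship scores across all rendered
    views of one 3D shape. Max-pooling (an illustrative merge rule) says
    a shape matches the sketch if its best-matching view does, so no
    single best view needs to be selected in advance."""
    return float(np.max(pair_scores))

# One sketch scored against 4 rendered views of each of two shapes.
shape_a = merge_view_scores([0.1, 0.9, 0.3, 0.2])  # strong from one view
shape_b = merge_view_scores([0.4, 0.4, 0.5, 0.3])  # mediocre from all views
print(shape_a > shape_b)  # True: shape A wins via its best view
```

The point of learning the merge, rather than fixing it to max, is robustness: a drawn sketch may partially match several views, and a learned combination can exploit that.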
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
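The lookup use case described above, categorising a problem by its data and label attributes and mapping it to a candidate method family, can be sketched as a simple table. The attribute names and the mapping below are illustrative, not the survey's exact taxonomy:

```python
# Hypothetical taxonomy: (source labels, target data, label space) -> method family.
taxonomy = {
    ("labelled_source", "unlabelled_target", "same_labels"): "unsupervised domain adaptation",
    ("labelled_source", "few_labelled_target", "same_labels"): "supervised domain adaptation",
    ("labelled_source", "no_target_data", "new_labels"): "zero-shot learning",
}

def lookup(source_labels, target_data, label_space):
    """Return the method family for a problem's attribute triple, or flag
    it as one of the scarcely studied configurations."""
    return taxonomy.get((source_labels, target_data, label_space),
                        "open problem / scarcely studied")

print(lookup("labelled_source", "no_target_data", "new_labels"))
# -> zero-shot learning
```

A practitioner would extend the key with the survey's full attribute set; the point is that a well-chosen attribute tuple makes the problem, and hence the relevant literature, directly addressable.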