Automatically Discovering and Learning New Visual Categories with Ranking Statistics
We tackle the problem of discovering novel classes in an image collection
given labelled examples of other classes. This setting is similar to
semi-supervised learning, but significantly harder because there are no
labelled examples for the new classes. The challenge, then, is to leverage the
information contained in the labelled images in order to learn a
general-purpose clustering model and use the latter to identify the new classes
in the unlabelled data. In this work we address this problem by combining three
ideas: (1) we suggest that the common approach of bootstrapping an image
representation using only the labelled data introduces an unwanted bias, and
that this can be avoided by using self-supervised learning to train the
representation from scratch on the union of labelled and unlabelled data; (2)
we use rank statistics to transfer the model's knowledge of the labelled
classes to the problem of clustering the unlabelled images; and, (3) we train
the data representation by optimizing a joint objective function on the
labelled and unlabelled subsets of the data, improving both the supervised
classification of the labelled data, and the clustering of the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform
current methods for novel category discovery by a significant margin.
Comment: ICLR 2020, code: http://www.robots.ox.ac.uk/~vgg/research/auto_nove
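The abstract above does not spell out how rank statistics are applied. As a minimal sketch, assume two unlabelled images are treated as belonging to the same class when the k largest-magnitude dimensions of their feature vectors coincide. The function names, the value of k, and the toy features are all hypothetical and only illustrate the idea of turning a learned representation into pairwise pseudo-labels for clustering.

```python
import numpy as np


def topk_index_sets(features, k):
    """Return, per row, the sorted indices of the k largest-magnitude dimensions."""
    order = np.argsort(-np.abs(features), axis=1)[:, :k]
    return np.sort(order, axis=1)


def pairwise_pseudo_labels(features, k=2):
    """Assumed rule: a pair gets label 1 when both rows share the same top-k index set."""
    top = topk_index_sets(features, k)
    same = (top[:, None, :] == top[None, :, :]).all(axis=2)
    return same.astype(np.int64)


# Toy features: rows 0 and 1 are dominated by dimensions {0, 1}, row 2 by {2, 3}.
features = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.95, 0.2, 0.1],
    [0.0, 0.1, 0.9, 0.8],
])
labels = pairwise_pseudo_labels(features, k=2)
print(labels)
```

In this sketch the resulting 0/1 matrix could serve as a target for a pairwise clustering loss on the unlabelled subset; the loss itself is not shown because the abstract does not describe it.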
Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories
Attribute-based recognition models, due to their impressive performance and
their ability to generalize well on novel categories, have been widely adopted
for many computer vision applications. However, usually both the attribute
vocabulary and the class-attribute associations have to be provided manually by
domain experts or a large number of annotators. This is very costly and not
necessarily optimal regarding recognition performance, and most importantly, it
limits the applicability of attribute-based models to large scale data sets. To
tackle this problem, we propose an end-to-end unsupervised attribute learning
approach. We utilize online text corpora to automatically discover a salient
and discriminative vocabulary that correlates well with the human concept of
semantic attributes. Moreover, we propose a deep convolutional model to
optimize class-attribute associations with a linguistic prior that accounts for
noise and missing data in text. In a thorough evaluation on ImageNet, we
demonstrate that our model is able to efficiently discover and learn semantic
attributes at a large scale. Furthermore, we demonstrate that our model
outperforms the state-of-the-art in zero-shot learning on three data sets:
ImageNet, Animals with Attributes and aPascal/aYahoo. Finally, we enable
attribute-based learning on ImageNet and will share the attributes and
associations for future research.
Comment: Accepted as a conference paper at CVPR 201
Visual Search at eBay
In this paper, we propose a novel end-to-end approach for scalable visual
search infrastructure. We discuss the challenges posed by a massive,
volatile inventory like eBay's and present our solution to overcome them. We
harness the large collection of eBay listing images and
state-of-the-art deep learning techniques to perform visual search at scale.
Supervised approaches to optimized search restricted to the top predicted
categories, and to compact binary signatures, are key to scaling up without
compromising accuracy and precision. Both use a common deep neural network requiring only a
single forward inference. The system architecture is presented with in-depth
discussions of its basic components and optimizations for a trade-off between
search relevance and latency. This solution is currently deployed in a
distributed cloud infrastructure and fuels visual search in eBay ShopBot and
Close5. We show a benchmark on the ImageNet dataset, on which our approach is faster
and more accurate than several unsupervised baselines. We share our learnings
with the hope that visual search becomes a first-class citizen for all
large-scale search engines rather than an afterthought.
Comment: To appear in 23rd SIGKDD Conference on Knowledge Discovery and Data
Mining (KDD), 2017. A demonstration video can be found at
https://youtu.be/iYtjs32vh4
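The two scaling ideas named in the abstract above, restricting search to the top predicted categories and comparing compact binary signatures, can be illustrated with a small sketch. Everything specific here is an assumption and not taken from the paper: signatures are obtained by thresholding an embedding at zero, candidates are ranked by Hamming distance, and the function names, category scores, and toy data are hypothetical.

```python
import numpy as np


def binarize(embeddings, threshold=0.0):
    """Assumed signature scheme: one bit per dimension, set when the value exceeds the threshold."""
    return (np.asarray(embeddings) > threshold).astype(np.uint8)


def search(query_bits, category_scores, index_bits, index_categories,
           top_categories=2, top_k=2):
    """Rank items from the highest-scoring categories by Hamming distance to the query."""
    shortlisted = np.argsort(-category_scores)[:top_categories]
    candidate_ids = np.flatnonzero(np.isin(index_categories, shortlisted))
    distances = (index_bits[candidate_ids] != query_bits).sum(axis=1)
    order = np.argsort(distances, kind="stable")[:top_k]
    return [(int(candidate_ids[i]), int(distances[i])) for i in order]


# Toy index of four items with 4-dimensional embeddings and a category id each.
index_bits = binarize([
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, 1.0, -1.0],
    [-1.0, -1.0, -1.0, 1.0],
    [1.0, -1.0, 1.0, 1.0],
])
index_categories = np.array([0, 0, 1, 2])

# In this sketch a single network pass is assumed to yield both outputs for the query.
query_bits = binarize([0.5, -0.2, 0.3, -0.4])
category_scores = np.array([0.7, 0.2, 0.1])

results = search(query_bits, category_scores, index_bits, index_categories)
print(results)
```

Item 3 is never scored in this example because its category falls outside the shortlist, which is the source of the speed-up: only a fraction of the inventory is compared against the query.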
The contribution of data mining to information science
The information explosion is a serious challenge for current information institutions. Data mining, the search for valuable information in large volumes of data, is one solution to this challenge. In the past several years, data mining has made a significant contribution to the field of information science. This paper examines the impact of data mining by reviewing existing applications, including personalized environments, electronic commerce, and search engines. For each of these three types of application, we discuss how data mining can enhance its functions. The reader should come away with an overview of the state-of-the-art research associated with these applications. Furthermore, we identify the limitations of current work and raise several directions for future research.