Information-Theoretic Active Learning for Content-Based Image Retrieval

Author: A Freytag
A Genz
A Lütz
AW Smeulders
B Demir
E Rodner
IJ Cox
O Russakovsky
S Ayache
TN Cardoso
Y Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/03/2019

We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it to acquire meaningful user feedback in content-based image retrieval. Instead of combining heuristics such as uncertainty, diversity, or density, our method maximizes the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach takes into account not only the structure of the data but also the expected model output change caused by the user feedback. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet.

Comment: GCPR 2018 paper (14 pages text + 2 pages references + 6 pages appendix)
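
The quantity this abstract builds on, the mutual information between an image's relevance and the feedback a user gives on it, can be illustrated for a single image. The sketch below is a toy calculation, not the ITAL batch objective or its approximations: the function name and the `p_flip` (wrong label) and `p_skip` (no answer) parameters are assumptions made for illustration, and the user is assumed to skip independently of the true relevance.

```python
import numpy as np

def feedback_information(p_relevant, p_flip=0.0, p_skip=0.0):
    """Mutual information (in bits) between relevance R and feedback F.

    Hypothetical user model: the user skips with probability p_skip,
    otherwise answers and gives the wrong label with probability p_flip.
    """
    prior = np.array([p_relevant, 1.0 - p_relevant])  # P(R): relevant, irrelevant
    answer = 1.0 - p_skip
    # P(F | R); columns are "relevant", "irrelevant", "no answer".
    channel = np.array([
        [answer * (1.0 - p_flip), answer * p_flip, p_skip],
        [answer * p_flip, answer * (1.0 - p_flip), p_skip],
    ])
    joint = prior[:, None] * channel                  # P(R, F)
    marginal_f = joint.sum(axis=0)                    # P(F)
    independent = prior[:, None] * marginal_f[None, :]
    mask = joint > 0                                  # skip 0 * log(0) terms
    return float((joint[mask] * np.log2(joint[mask] / independent[mask])).sum())
```

Under these assumptions, a maximally uncertain image (`p_relevant=0.5`) answered by a perfect user yields one bit, while a user who labels at random (`p_flip=0.5`) yields none, which is why a model of user behavior changes which images are worth asking about.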

arXiv.org e-Print Archive

Crossref

Minimizing Supervision in Multi-label Categorization

Author: Namboodiri Vinay P.
Rajat
Singh Pravendra
Varshney Munender
Publication date: 26/05/2020

Multiple categories of objects are present in most images, so treating categorization as multi-class classification is not justified. We treat it as a multi-label classification problem. In this paper, we further aim to minimize the supervision required for multi-label classification. Specifically, we investigate an effective class of approaches that associate a weak localization with each category, in terms of either a bounding box or a segmentation mask. Doing so improves the accuracy of multi-label categorization. The approach we adopt is one of active learning, i.e., incrementally selecting a set of samples that need supervision based on the current model, obtaining supervision for these samples, retraining the model with the additional set of supervised samples, and proceeding again to select the next set of samples. A crucial concern is the choice of the set of samples. In doing so, we provide a novel insight: no specific measure succeeds in obtaining a consistently improved selection criterion. We therefore provide a selection criterion that consistently improves on the overall baseline criterion by choosing the top k set of samples for a varied set of criteria. Using this criterion, we show that we can retain more than 98% of the fully supervised performance with just 20% of the samples (and more than 96% using 10%) of the dataset on PASCAL VOC 2007 and 2012. Also, our proposed approach consistently outperforms all other baseline metrics for all benchmark datasets and model combinations.

Comment: Accepted in CVPR-W 202
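
The selection step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' criterion: the abstract does not say how the per-criterion top-k sets are combined, so the sketch takes their union, and the two scoring functions (`entropy_score`, `margin_score`) are common uncertainty measures chosen here as stand-ins. `probs` is assumed to hold the current model's per-label probabilities for each unlabeled sample.

```python
import numpy as np

def entropy_score(probs):
    """Sum of per-label binary entropies (higher means more uncertain)."""
    p = np.clip(probs, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p)).sum(axis=1)

def margin_score(probs):
    """Negated distance of the least confident label from 0.5."""
    return -np.abs(probs - 0.5).min(axis=1)

def select_batch(probs, k, criteria=(entropy_score, margin_score)):
    """Union of the top-k unlabeled samples under each criterion (assumed rule)."""
    chosen = set()
    for criterion in criteria:
        scores = criterion(probs)
        top_k = np.argsort(-scores, kind="stable")[:k]  # highest scores first
        chosen.update(int(i) for i in top_k)
    return sorted(chosen)
```

In an active learning round, the indices returned by `select_batch` would be sent for annotation, added to the labeled set, and the model retrained before the next call. Because different criteria can rank samples differently, the union may contain more than k samples.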

arXiv.org e-Print Archive

Crossref

BatchRank: A Novel Batch Mode Active Learning Framework for Hierarchical Classification

Author: Adepu R S
Balasubramanian Vineeth N
Chakraborty S
Panchanathan S
Ye J
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Active learning algorithms automatically identify the salient and exemplar instances from large amounts of unlabeled data and thus reduce human annotation effort in inducing a classification model. More recently, Batch Mode Active Learning (BMAL) techniques have been proposed, where a batch of data samples is selected simultaneously from an unlabeled set. Most active learning algorithms assume a flat label space, that is, they consider the class labels to be independent. However, in many applications, the set of class labels is organized in a hierarchical tree structure, with the leaf nodes as outputs and the internal nodes as clusters of outputs at multiple levels of granularity. In this paper, we propose a novel BMAL algorithm (BatchRank) for hierarchical classification. The sample selection is posed as an NP-hard integer quadratic programming problem and a convex relaxation (based on linear programming) is derived, whose solution is further improved by an iterative truncated power method. Finally, a deterministic bound is established on the quality of the solution. Our empirical results on several challenging, real-world datasets from multiple domains corroborate the potential of the proposed framework for real-world hierarchical classification applications.
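
The truncated power method mentioned in this abstract can be sketched in its generic form: a power iteration that keeps only the k largest-magnitude entries at each step, which approximately maximizes x^T Q x over unit vectors with at most k nonzeros. The code below is that generic iteration only, not BatchRank's formulation or its linear-programming relaxation. The matrix `Q` (assumed symmetric positive semidefinite, e.g. combining per-sample informativeness and pairwise terms), the uniform starting vector, and the function name are assumptions made for illustration.

```python
import numpy as np

def truncated_power_select(Q, k, n_iter=100, tol=1e-9):
    """Pick k indices via truncated power iteration on a symmetric PSD matrix Q."""
    n = Q.shape[0]
    x = np.full(n, 1.0 / np.sqrt(n))        # assumed uniform start
    for _ in range(n_iter):
        y = Q @ x                            # power step
        keep = np.argsort(-np.abs(y), kind="stable")[:k]
        x_new = np.zeros(n)
        x_new[keep] = y[keep]                # truncate to the k largest entries
        norm = np.linalg.norm(x_new)
        if norm == 0.0:                      # degenerate Q; keep current x
            break
        x_new /= norm
        converged = np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    support = np.argsort(-np.abs(x), kind="stable")[:k]
    return sorted(int(i) for i in support)
```

The iteration converges to a local solution that depends on the starting vector, which is consistent with the abstract using it to refine a solution obtained elsewhere rather than as a standalone solver.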

Research Archive of Indian Institute of Technology Hyderabad