Local Descriptors Optimized for Average Precision
Extraction of local feature descriptors is a vital stage in the solution
pipelines for numerous computer vision tasks. Learning-based approaches improve
performance in certain tasks, but still cannot replace handcrafted features in
general. In this paper, we improve the learning of local feature descriptors by
optimizing the performance of descriptor matching, which is a common stage that
follows descriptor extraction in local feature based pipelines, and can be
formulated as nearest neighbor retrieval. Specifically, we directly optimize a
ranking-based retrieval performance metric, Average Precision, using deep
neural networks. This general-purpose solution can also be viewed as a listwise
learning to rank approach, which is advantageous compared to recent local
ranking approaches. On standard benchmarks, descriptors learned with our
formulation achieve state-of-the-art results in patch verification, patch
retrieval, and image matching. Comment: 13 pages, 8 figures. IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 201
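As a minimal illustration of the metric this abstract refers to, the sketch below treats descriptor matching as nearest-neighbor retrieval and scores one query's ranking with Average Precision. It only evaluates AP; it does not reproduce the paper's training procedure. The function names, the Euclidean distance, and the toy descriptors are assumptions made for illustration.

```python
import math

def average_precision(ranked_labels):
    """AP of a ranked list of binary labels (1 = true match), best-ranked first."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            # Precision at this rank, accumulated only at true matches.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def retrieval_ap(query, database, is_match):
    """Rank database descriptors by Euclidean distance to the query, then compute AP."""
    order = sorted(range(len(database)),
                   key=lambda i: math.dist(query, database[i]))
    return average_precision([is_match[i] for i in order])

# Toy 2-D descriptors (hypothetical values, for illustration only).
query = (0.0, 0.0)
database = [(0.1, 0.0), (1.0, 1.0), (0.0, 0.3), (2.0, 0.0)]
is_match = [1, 0, 1, 0]
ap = retrieval_ap(query, database, is_match)
```

Here both true matches are the two nearest neighbors of the query, so the ranking is perfect for this toy case.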
Overview of the CLEF-2005 cross-language speech retrieval track
The task for the CLEF-2005 cross-language speech retrieval track was to identify topically coherent segments of English interviews in a known-boundary condition. Seven teams participated, performing both monolingual and cross-language searches of ASR transcripts, automatically generated metadata, and manually generated metadata.
Results indicate that monolingual search technology is sufficiently accurate to be useful for some purposes (the
best mean average precision was 0.18) and cross-language searching yielded results typical of those seen in other
applications (with the best systems approximating monolingual mean average precision).
Beyond Cumulated Gain and Average Precision: Including Willingness and Expectation in the User Model
In this paper, we define a new metric family based on two concepts: the
definition of the stopping criterion and the notion of satisfaction, where the
former depends on the willingness and expectation of a user exploring search
results. Both concepts have been discussed so far in the IR literature, but we
argue in this paper that defining a proper single-valued metric depends on
merging them into a single conceptual framework.
Hierarchical Average Precision Training for Pertinent Image Retrieval
Image Retrieval is commonly evaluated with Average Precision (AP) or
Recall@k. Yet those metrics are limited to binary labels and do not take into
account the severity of errors. This paper introduces a new hierarchical AP training
method for pertinent image retrieval (HAPPIER). HAPPIER is based on a new H-AP
metric, which leverages a concept hierarchy to refine AP by integrating errors'
importance and better evaluate rankings. To train deep models with H-AP, we
carefully study the problem's structure and design a smooth lower bound
surrogate combined with a clustering loss that ensures consistent ordering.
Extensive experiments on 6 datasets show that HAPPIER significantly outperforms
state-of-the-art methods for hierarchical retrieval, while being on par with
the latest approaches when evaluating fine-grained ranking performances.
Finally, we show that HAPPIER leads to better organization of the embedding
space, and prevents most severe failure cases of non-hierarchical methods. Our
code is publicly available at: https://github.com/elias-ramzi/HAPPIER
Threshold-free Evaluation of Medical Tests for Classification and Prediction: Average Precision versus Area Under the ROC Curve
When evaluating medical tests or biomarkers for disease classification, the
area under the receiver-operating characteristic (ROC) curve is a widely used
performance metric that does not require us to commit to a specific decision
threshold. For the same type of evaluations, a different metric known as the
average precision (AP) is used much more widely in the information retrieval
literature. We study both metrics in some depth in order to elucidate their
differences and relationship. More specifically, we explain mathematically why
the AP may be more appropriate if the earlier part of the ROC curve is of
interest. We also address practical matters, deriving an expression for the
asymptotic variance of the AP, as well as providing real-world examples
concerning the evaluation of protein biomarkers for prostate cancer and the
assessment of digital versus film mammography for breast cancer screening. Comment: The first two authors contributed equally to this paper, and should
be regarded as co-first authors.
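The contrast between the two metrics can be seen on a small example. The sketch below, under the assumption of made-up scores and labels, builds two rankings with identical area under the ROC curve but different AP; the ranking that places a positive case at the very top gets the higher AP. The function names are hypothetical, and this is an illustration of the two definitions, not the paper's derivation.

```python
def average_precision(scores, labels):
    """AP: mean of precision values taken at the rank of each positive case."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda t: -t[0])]
    hits, total = 0, 0.0
    for rank, y in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up test scores, highest first; labels mark diseased (1) vs healthy (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels_a = [1, 0, 0, 1, 0, 0]  # one positive ranked first, one ranked fourth
labels_b = [0, 1, 1, 0, 0, 0]  # both positives ranked just below the top

auc_a, auc_b = roc_auc(scores, labels_a), roc_auc(scores, labels_b)
ap_a, ap_b = average_precision(scores, labels_a), average_precision(scores, labels_b)
print(f"AUC: {auc_a:.3f} vs {auc_b:.3f}; AP: {ap_a:.3f} vs {ap_b:.3f}")
```

Both rankings order six of the eight positive-negative pairs correctly, so the AUC cannot tell them apart, while AP rewards the ranking whose top result is a true positive.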
Uncertainty sampling for action recognition via maximizing expected average precision
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved. Recognizing human actions in video clips has been an important topic in computer vision. Sufficient labeled data is one of the prerequisites for the good performance of action recognition algorithms. However, while abundant videos can be collected from the Internet, categorizing each video clip is time-consuming. Active learning is one way to alleviate the labeling labor by allowing the classifier to choose the most informative unlabeled instances for manual annotation. Among various active learning algorithms, uncertainty sampling is arguably the most widely used strategy. Conventional uncertainty sampling strategies such as entropy-based methods are usually evaluated by accuracy. However, in action recognition, Average Precision (AP) is an acknowledged evaluation metric that has been largely ignored in the active learning community. It is defined as the area under the precision-recall curve. In this paper, we propose a novel uncertainty sampling algorithm for action recognition using expected AP. We conduct experiments on three real-world action recognition datasets and show that our algorithm outperforms other uncertainty-based active learning algorithms.