5,887 research outputs found
A Theoretical Analysis of NDCG Type Ranking Measures
A central problem in ranking is to design a ranking measure for evaluation of
ranking functions. In this paper we study, from a theoretical perspective, the
widely used Normalized Discounted Cumulative Gain (NDCG)-type ranking measures.
Although there are extensive empirical studies of NDCG, little is known about
its theoretical properties. We first show that, whatever the ranking function
is, the standard NDCG which adopts a logarithmic discount, converges to 1 as
the number of items to rank goes to infinity. On the first sight, this result
is very surprising. It seems to imply that NDCG cannot differentiate good and
bad ranking functions, contradicting to the empirical success of NDCG in many
applications. In order to have a deeper understanding of ranking measures in
general, we propose a notion referred to as consistent distinguishability. This
notion captures the intuition that a ranking measure should have such a
property: For every pair of substantially different ranking functions, the
ranking measure can decide which one is better in a consistent manner on almost
all datasets. We show that NDCG with logarithmic discount has consistent
distinguishability although it converges to the same limit for all ranking
functions. We next characterize the set of all feasible discount functions for
NDCG according to the concept of consistent distinguishability. Specifically we
show that whether NDCG has consistent distinguishability depends on how fast
the discount decays, and 1/r is a critical point. We then turn to the cut-off
version of NDCG, i.e., NDCG@k. We analyze the distinguishability of NDCG@k for
various choices of k and the discount functions. Experimental results on real
Web search datasets agree well with the theory.Comment: COLT 201
Recommended from our members
Deep learning for cardiac image segmentation: A review
Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US) and major anatomical structures of interest (ventricles, atria and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories are included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations with current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research
Convex Calibration Dimension for Multiclass Loss Matrices
We study consistency properties of surrogate loss functions for general
multiclass learning problems, defined by a general multiclass loss matrix. We
extend the notion of classification calibration, which has been studied for
binary and multiclass 0-1 classification problems (and for certain other
specific learning problems), to the general multiclass setting, and derive
necessary and sufficient conditions for a surrogate loss to be calibrated with
respect to a loss matrix in this setting. We then introduce the notion of
convex calibration dimension of a multiclass loss matrix, which measures the
smallest `size' of a prediction space in which it is possible to design a
convex surrogate that is calibrated with respect to the loss matrix. We derive
both upper and lower bounds on this quantity, and use these results to analyze
various loss matrices. In particular, we apply our framework to study various
subset ranking losses, and use the convex calibration dimension as a tool to
show both the existence and non-existence of various types of convex calibrated
surrogates for these losses. Our results strengthen recent results of Duchi et
al. (2010) and Calauzenes et al. (2012) on the non-existence of certain types
of convex calibrated surrogates in subset ranking. We anticipate the convex
calibration dimension may prove to be a useful tool in the study and design of
surrogate losses for general multiclass learning problems.Comment: Accepted to JMLR, pending editin
Designing High-Performing Networks for Multi-Scale Computer Vision
Since the emergence of deep learning, the computer vision field has
flourished with models improving at a rapid pace on more and more complex
tasks. We distinguish three main ways to improve a computer vision model: (1)
improving the data aspect by for example training on a large, more diverse
dataset, (2) improving the training aspect by for example designing a better
optimizer, and (3) improving the network architecture (or network for short).
In this thesis, we chose to improve the latter, i.e. improving the network
designs of computer vision models. More specifically, we investigate new
network designs for multi-scale computer vision tasks, which are tasks
requiring to make predictions about concepts at different scales. The goal of
these new network designs is to outperform existing baseline designs from the
literature. Specific care is taken to make sure the comparisons are fair, by
guaranteeing that the different network designs were trained and evaluated with
the same settings. Code is publicly available at
https://github.com/CedricPicron/DetSeg.Comment: PhD thesi
Building and use of objective tests in high school economics
Thesis (M.B.A.)--Boston University
- …