Efficient Optimization for Rank-based Loss Functions
The accuracy of information retrieval systems is often measured using complex
loss functions such as the average precision (AP) or the normalized discounted
cumulative gain (NDCG). Given a set of positive and negative samples, the
parameters of a retrieval system can be estimated by minimizing these loss
functions. However, the non-differentiability and non-decomposability of these
loss functions do not allow for simple gradient-based optimization
algorithms. This issue is generally circumvented by either optimizing a
structured hinge-loss upper bound to the loss function or by using asymptotic
methods like the direct-loss minimization framework. Yet, the high
computational complexity of loss-augmented inference, which is necessary for
both frameworks, prohibits their use on large training data sets. To
alleviate this deficiency, we present a novel quicksort-flavored algorithm for
a large class of non-decomposable loss functions. We provide a complete
characterization of the loss functions that are amenable to our algorithm, and
show that it includes both AP and NDCG based loss functions. Furthermore, we
prove that no comparison-based algorithm can improve upon the computational
complexity of our approach asymptotically. We demonstrate the effectiveness of
our approach in the context of optimizing the structured hinge loss upper bound
of AP and NDCG loss for learning models for a variety of vision tasks. We show
that our approach provides significantly better results than simpler
decomposable loss functions, while requiring a comparable training time.
Comment: 15 pages, 2 figures
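To make the non-decomposability mentioned in this abstract concrete, here is a minimal sketch of how AP and the corresponding AP loss (1 - AP) can be computed from scores and binary labels. The function names `average_precision` and `ap_loss` are illustrative assumptions, and this is not the quicksort-flavored algorithm the abstract describes; it only shows that the loss depends on the whole ranking rather than on a sum of per-sample terms.

```python
def average_precision(scores, labels):
    """AP of the ranking obtained by sorting samples by descending score.

    labels are 1 for positive samples and 0 for negative samples.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision at this rank, counted only at positive samples.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def ap_loss(scores, labels):
    """AP loss: zero when every positive is ranked above every negative."""
    return 1.0 - average_precision(scores, labels)
```

For example, with scores `[0.9, 0.8, 0.7, 0.6]` and labels `[1, 0, 1, 0]`, the positives sit at ranks 1 and 3, so AP is (1/1 + 2/3) / 2 = 5/6. Changing a single score can reorder the list and change every term, which is why simple per-sample gradients do not apply.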
Putting the Horse Before the Cart: A Generator-Evaluator Framework for Question Generation from Text
Automatic question generation (QG) is a useful yet challenging task in NLP.
Recent neural network-based approaches represent the state-of-the-art in this
task. In this work, we attempt to strengthen them significantly by adopting a
holistic and novel generator-evaluator framework that directly optimizes
objectives that reward semantics and structure. The generator is a
sequence-to-sequence model that incorporates the structure and
semantics of the question being generated. The generator predicts an answer in
the passage that the question can pivot on. Employing the copy and coverage
mechanisms, it also acknowledges other contextually important (and possibly
rare) keywords in the passage that the question needs to conform to, while not
redundantly repeating words. The evaluator model evaluates and assigns a
reward to each predicted question based on its conformity to the
structure of ground-truth questions. We propose two novel QG-specific reward
functions for text conformity and answer conformity of the generated question.
The evaluator also employs structure-sensitive rewards based on evaluation
measures such as BLEU, GLEU, and ROUGE-L, which are suitable for QG. In
contrast, most previous work optimizes only the cross-entropy loss,
which can induce inconsistencies between training (objective) and testing
(evaluation) measures. Our evaluation shows that our approach significantly
outperforms state-of-the-art systems on the widely-used SQuAD benchmark as per
both automatic and human evaluation.
Comment: 10 pages, The SIGNLL Conference on Computational Natural Language
Learning (CoNLL 2019)
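As a rough illustration of a structure-sensitive reward of the kind this abstract mentions, the sketch below scores a generated question against a ground-truth question with a ROUGE-L style F-measure based on the longest common subsequence (LCS) of tokens. Whitespace tokenization, lowercasing, the default `beta=1.0`, and the function names are assumptions made for the example; the abstract does not specify how the authors compute their rewards.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_reward(candidate, reference, beta=1.0):
    """ROUGE-L style F-measure between a generated and a reference question."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)
```

A score like this could serve as a scalar reward for a generated sequence; identical questions score 1.0 and questions with no shared tokens score 0.0.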
Rank-based Decomposable Losses in Machine Learning: A Survey
Recent works have revealed an essential paradigm in designing loss functions
that differentiate individual losses vs. aggregate losses. The individual loss
measures the quality of the model on a sample, while the aggregate loss
combines individual losses/scores over each training sample. Both have a common
procedure that aggregates a set of individual values to a single numerical
value. The ranking order reflects the most fundamental relation among
individual values in designing losses. In addition, decomposability, in which a
loss can be decomposed into an ensemble of individual terms, becomes a
significant property of organizing losses/scores. This survey provides a
systematic and comprehensive review of rank-based decomposable losses in
machine learning. Specifically, we provide a new taxonomy of loss functions
that follows the perspectives of aggregate loss and individual loss. We
identify the aggregator to form such losses, which are examples of set
functions. We organize the rank-based decomposable losses into eight
categories. Following these categories, we review the literature on rank-based
aggregate losses and rank-based individual losses. We describe general formulas
for these losses and connect them with existing research topics. We also
suggest future research directions spanning unexplored, remaining, and emerging
issues in rank-based decomposable losses.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
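One way to picture a rank-based aggregate loss is the average of the k largest individual losses. The sketch below is an illustrative example chosen for this listing, with the assumed name `average_top_k`; it is not taken from the survey's taxonomy. Individual losses are ranked, and only the top k contribute to the aggregate.

```python
def average_top_k(losses, k):
    """Aggregate individual losses by averaging the k largest ones."""
    if not 1 <= k <= len(losses):
        raise ValueError("k must be between 1 and the number of losses")
    ranked = sorted(losses, reverse=True)  # rank individual losses, largest first
    return sum(ranked[:k]) / k
```

With k = 1 this reduces to the maximum individual loss, and with k equal to the number of samples it reduces to the ordinary average, so the choice of k moves the aggregate between the two.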
An End-to-End Approach for Training Neural Network Binary Classifiers on Metrics Based on the Confusion Matrix
While neural network binary classifiers are often evaluated on metrics such
as Accuracy and F1-Score, they are commonly trained with a cross-entropy
objective. How can this training-testing gap be addressed? While specific
techniques have been adopted to optimize certain confusion-matrix-based
metrics, it is challenging or impossible in some cases to generalize these
techniques to other metrics. Adversarial learning approaches have also been
proposed to optimize networks via confusion-matrix-based metrics, but they tend
to be much slower than common training methods. In this work, we propose to
approximate the Heaviside step function, typically used to compute
confusion-matrix-based metrics, to render these metrics amenable to gradient descent. Our
extensive experiments show the effectiveness of our end-to-end approach for
binary classification in several domains.
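The idea of replacing the Heaviside step with a smooth surrogate can be sketched as follows. This example assumes a sigmoid as the approximation, with hypothetical `threshold` and `steepness` parameters, and computes a "soft" F1 from soft confusion-matrix counts. The abstract does not say which approximation the authors use, so treat this as one possible instance rather than their method.

```python
import math


def soft_step(p, threshold=0.5, steepness=10.0):
    """Sigmoid surrogate for the Heaviside step H(p - threshold)."""
    return 1.0 / (1.0 + math.exp(-steepness * (p - threshold)))


def soft_f1(probs, labels, threshold=0.5, steepness=10.0):
    """F1 computed from soft true-positive, false-positive and false-negative counts."""
    tp = fp = fn = 0.0
    for p, y in zip(probs, labels):
        s = soft_step(p, threshold, steepness)  # soft "predicted positive"
        tp += s * y
        fp += s * (1 - y)
        fn += (1 - s) * y
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0
```

Because every count is a smooth function of the predicted probabilities, a quantity such as `1 - soft_f1(...)` could in principle be minimized with gradient descent in an automatic-differentiation framework; a larger `steepness` tracks the hard step more closely at the cost of flatter gradients away from the threshold.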