51,643 research outputs found
Rigorously assessing software reliability and safety
This paper summarises the state of the art in the assessment of software reliability and safety ("dependability"), and describes some promising developments. A sound demonstration of very high dependability is still impossible before operation of the software; but research is finding ways to make rigorous assessment increasingly feasible. While refined mathematical techniques cannot take the place of factual knowledge, they can allow the decision-maker to draw more accurate conclusions from the knowledge that is available.
Field-aware Calibration: A Simple and Empirically Strong Method for Reliable Probabilistic Predictions
It is often observed that the probabilistic predictions given by a machine
learning model can disagree with the averaged actual outcomes on specific
subsets of data, an issue known as miscalibration. Miscalibration undermines
the reliability of practical machine learning systems. For example, in
online advertising, an ad can receive a click-through rate prediction of 0.1
over some population of users where its actual click rate is 0.15. In such
cases, the probabilistic predictions have to be corrected before the system
can be deployed.
In this paper, we first introduce a new evaluation metric, named the
field-level calibration error, that measures the bias in predictions over a
sensitive input field that concerns the decision-maker. We show that
existing post-hoc calibration methods yield limited improvement on the new
field-level metric and on non-calibration metrics such as the AUC score. To
address this, we propose Neural Calibration, a simple yet powerful post-hoc
calibration method that learns to calibrate by making full use of the
field-aware information in the validation set. We present extensive
experiments on five large-scale datasets. The results show that Neural
Calibration significantly improves over uncalibrated predictions on common
metrics such as the negative log-likelihood, the Brier score and the AUC,
as well as on the proposed field-level calibration error.
Comment: WWW 202
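The abstract above gives only an informal description of the field-level calibration error. A minimal sketch of how such a metric could be computed is shown below; the function name and the group-size weighting are assumptions for illustration, and the paper's exact definition may aggregate differently:

```python
import numpy as np

def field_level_calibration_error(preds, labels, field_values):
    """Average absolute gap between predicted and observed rates,
    computed separately for each value of a chosen input field and
    weighted by each subset's share of the data (illustrative form;
    the paper's exact definition may differ)."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    field_values = np.asarray(field_values)
    total = len(preds)
    err = 0.0
    for v in np.unique(field_values):
        mask = field_values == v
        # Gap between the average prediction and the average outcome
        # in this subset of the data.
        err += (mask.sum() / total) * abs(preds[mask].mean()
                                          - labels[mask].mean())
    return err
```

Using the abstract's example, 20 impressions all predicted at 0.1 with an actual click rate of 0.15 over a single user population give an error of 0.05.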
Learning to Rank Academic Experts in the DBLP Dataset
Expert finding is an information retrieval task concerned with the search
for the most knowledgeable people with respect to a specific topic, where
the search is based on documents that describe people's activities. The
task involves taking a user query as input and returning a list of people
sorted by their level of expertise with respect to that query. Despite
recent interest in the area, the current state-of-the-art techniques lack
principled approaches for optimally combining different sources of
evidence. This article proposes two frameworks for combining multiple
estimators of expertise. These estimators are derived from textual
contents, from the graph structure of the citation patterns for the
community of experts, and from profile information about the experts. More
specifically, this article explores the use of supervised learning-to-rank
methods, as well as rank aggregation approaches, for combining all of the
estimators of expertise. Several supervised learning algorithms,
representative of the pointwise, pairwise and listwise approaches, were
tested, and various state-of-the-art data fusion techniques were also
explored for the rank aggregation framework. Experiments performed on a
dataset of academic publications from the Computer Science domain attest
to the adequacy of the proposed approaches.
Comment: Expert Systems, 2013. arXiv admin note: text overlap with
arXiv:1302.041
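The abstract above mentions rank aggregation over several estimators of expertise. One classic data-fusion baseline of the kind such work evaluates is the Borda count; the sketch below is an illustrative implementation (the function name and example expert ids are assumptions, not taken from the article):

```python
def borda_aggregate(rankings):
    """Combine several ranked lists of candidate experts via Borda count.
    Each element of `rankings` is a list of expert ids, best first.
    Returns a single fused ranking, best first."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, expert in enumerate(ranking):
            # An expert at position p in a list of n items earns n - p points.
            scores[expert] = scores.get(expert, 0) + (n - position)
    # Higher total Borda score means a higher aggregated rank.
    return sorted(scores, key=lambda e: -scores[e])

# Three estimators (e.g., text, citation graph, profile) may rank the
# same experts differently; the fused list balances their votes:
fused = borda_aggregate([["ann", "bob", "eve"],
                         ["bob", "ann", "eve"],
                         ["ann", "eve", "bob"]])
```

Here "ann" collects 3 + 2 + 3 = 8 points, "bob" 2 + 3 + 1 = 6, and "eve" 1 + 1 + 2 = 4, so the fused ranking places "ann" first.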