A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization
The performance of many machine learning techniques depends on the choice of
an appropriate similarity or distance measure on the input space. Similarity
learning (or metric learning) aims at building such a measure from training
data so that observations with the same (resp. different) label are as close
(resp. far) as possible. In this paper, similarity learning is investigated
from the perspective of pairwise bipartite ranking, where the goal is to rank
the elements of a database by decreasing order of the probability that they
share the same label with some query data point, based on the similarity
scores. A natural performance criterion in this setting is pointwise ROC
optimization: maximize the true positive rate under a fixed false positive
rate. We study this novel perspective on similarity learning through a rigorous
probabilistic framework. The empirical version of the problem gives rise to a
constrained optimization formulation involving U-statistics, for which we
derive universal learning rates as well as faster rates under a noise
assumption on the data distribution. We also address the large-scale setting by
analyzing the effect of sampling-based approximations. Our theoretical results
are supported by illustrative numerical experiments.
Comment: 8 pages main paper, 22 pages with appendices, proceedings of ICML 2018
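To make the pointwise ROC criterion concrete: given a similarity function, the empirical evaluation picks a threshold so that the false positive rate over negative (different-label) pairs stays below a budget alpha, then measures the true positive rate over positive (same-label) pairs. The numpy sketch below illustrates this evaluation step only; the similarity function, data, and alpha are placeholders, and this is not the paper's learning algorithm.

```python
import numpy as np

def pointwise_roc_tpr(X, y, sim, alpha=0.05):
    """Empirical TPR of a pairwise similarity score at FPR <= alpha.

    Positive pairs share a label, negative pairs do not (as in the
    pairwise bipartite ranking view of similarity learning).
    """
    n = len(y)
    i, j = np.triu_indices(n, k=1)      # all distinct pairs
    scores = sim(X[i], X[j])            # similarity of each pair
    pos = y[i] == y[j]                  # same-label pairs are positives
    # Threshold at the (1 - alpha) quantile of negative-pair scores,
    # so the empirical false positive rate is at most alpha.
    t = np.quantile(scores[~pos], 1.0 - alpha)
    return np.mean(scores[pos] > t)

# Toy example with the negative-Euclidean similarity s(x, x') = -||x - x'||.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
print(pointwise_roc_tpr(X, y, lambda a, b: -np.linalg.norm(a - b, axis=1)))
```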
The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
We are interested in supervised ranking algorithms that perform especially well near the top of the
ranked list, and are only required to perform sufficiently well on the rest of the list. In this work,
we provide a general form of convex objective that gives high-scoring examples more importance.
This “push” near the top of the list can be chosen arbitrarily large or small, based on the preference
of the user. We choose ℓp-norms to provide a specific type of push; if the user sets p larger, the
objective concentrates harder on the top of the list. We derive a generalization bound based on
the p-norm objective, working around the natural asymmetry of the problem. We then derive a
boosting-style algorithm for the problem of ranking with a push at the top. The usefulness of the
algorithm is illustrated through experiments on repository data. We prove that the minimizer of the
algorithm’s objective is unique in a specific sense. Furthermore, we illustrate how our objective is
related to quality measurements for information retrieval.
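For intuition about how the push works, the sketch below computes an objective of the shape the abstract describes, assuming the exponential loss as the convex surrogate: each negative example accumulates pairwise losses against all positives, and that sum is raised to the power p, so negatives that intrude near the top of the list dominate the objective as p grows. This is an illustration of the construction, not necessarily the paper's exact objective or algorithm.

```python
import numpy as np

def pnorm_push_objective(scores_pos, scores_neg, p=4.0):
    """Illustrative p-norm push objective with an exponential surrogate.

    Each negative accumulates exp(-(s_pos - s_neg)) over all positives;
    raising that sum to the power p concentrates the penalty on negatives
    ranked near the top of the list.
    """
    # margins[i, k] = s_pos[i] - s_neg[k]; small or negative margins mean
    # negative k is ranked close to (or above) positive i.
    margins = scores_pos[:, None] - scores_neg[None, :]
    per_negative = np.exp(-margins).sum(axis=0)   # loss mass per negative
    return np.sum(per_negative ** p)

rng = np.random.default_rng(0)
s_pos = rng.normal(1.0, 1.0, size=50)
s_neg = rng.normal(0.0, 1.0, size=200)
for p in (1.0, 4.0, 16.0):   # larger p pushes harder on the top of the list
    print(p, pnorm_push_objective(s_pos, s_neg, p))
```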
Learning Fair Scoring Functions: Bipartite Ranking under ROC-based Fairness Constraints
Many applications of AI involve scoring individuals using a learned function
of their attributes. These predictive risk scores are then used to take
decisions based on whether the score exceeds a certain threshold, which may
vary depending on the context. The level of delegation granted to such systems
in critical applications like credit lending and medical diagnosis will heavily
depend on how questions of fairness can be answered. In this paper, we study
fairness for the problem of learning scoring functions from binary labeled
data, a classic learning task known as bipartite ranking. We argue that the
functional nature of the ROC curve, the gold standard measure of ranking
accuracy in this context, leads to several ways of formulating fairness
constraints. We introduce general families of fairness definitions based on the
AUC and on ROC curves, and show that our ROC-based constraints can be
instantiated such that classifiers obtained by thresholding the scoring
function satisfy classification fairness for a desired range of thresholds. We
establish generalization bounds for scoring functions learned under such
constraints, design practical learning algorithms, and show the relevance of our
approach with numerical experiments on real and synthetic data.
Comment: 35 pages, 13 figures, 6 tables
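As one concrete instantiation of an AUC-based fairness notion (an illustrative sketch under assumptions, not the paper's exact family of constraints): compute the empirical AUC separately within each sensitive group and constrain the gap between them.

```python
import numpy as np

def group_auc(scores, labels):
    """Empirical AUC: P(score of a random positive > score of a random negative)."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    return np.mean(pos[:, None] > neg[None, :])

def auc_fairness_gap(scores, labels, group):
    """Gap between intra-group AUCs for a binary sensitive attribute."""
    auc_a = group_auc(scores[group == 0], labels[group == 0])
    auc_b = group_auc(scores[group == 1], labels[group == 1])
    return abs(auc_a - auc_b)

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)
# A score that separates the classes better for group 0 than for group 1.
scores = labels + rng.normal(0, 1 + 0.8 * group)
print(auc_fairness_gap(scores, labels, group))  # a learner would constrain this <= epsilon
```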
Extending Bayesian network models for mining and classification of glaucoma
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Glaucoma is a degenerative disease that damages the nerve fiber layer in the retina of the eye. Its mechanisms are not fully known and there is no fully effective strategy to prevent visual impairment and blindness. However, if treatment is carried out at an early stage, it is possible to slow glaucomatous progression and improve the quality of life of sufferers. Despite the great amount of heterogeneous data that has become available for monitoring glaucoma, the performance of tests for early diagnosis is still insufficient, due to the complexity of disease progression and the difficulties in obtaining sufficient measurements. This research aims to assess and extend Bayesian Network (BN) models to investigate the nature of the disease and its progression, as well as to improve early diagnosis performance. The flexibility of BNs and their ability to integrate with clinician expertise make them a suitable tool to effectively exploit the available data. After presenting the problem, a series of BN models for cross-sectional data classification and integration are assessed; novel techniques are then proposed for classification and modelling of glaucoma progression. The results are validated against literature, direct expert knowledge and other Artificial Intelligence techniques, indicating that BNs and their proposed extensions improve glaucoma diagnosis performance and enable new insights into the disease process.
A Topic Coverage Approach to Evaluation of Topic Models
Topic models are widely used unsupervised models of text capable of learning
topics - weighted lists of words and documents - from large collections of text
documents. When topic models are used for discovery of topics in text
collections, a question that arises naturally is how well the model-induced
topics correspond to topics of interest to the analyst. In this paper we
revisit and extend a so far neglected approach to topic model evaluation based
on measuring topic coverage - computationally matching model topics with a set
of reference topics that models are expected to uncover. The approach is well
suited for analyzing models' performance in topic discovery and for large-scale
analysis of both topic models and measures of model quality. We propose new
measures of coverage and evaluate, in a series of experiments, different types
of topic models on two distinct text domains for which interest for topic
discovery exists. The experiments include evaluation of model quality, analysis
of coverage of distinct topic categories, and the analysis of the relationship
between coverage and other methods of topic model evaluation. The contributions
of the paper include new measures of coverage, insights into both topic models
and other methods of model evaluation, and the datasets and code for
facilitating future research of both topic coverage and other approaches to
topic model evaluation.
Comment: Results and contributions unchanged; Added new references; Improved the contextualization and the description of the work (abstr, intro, 7.1 concl, rw, concl); Moved technical details of data and model building to appendices; Improved layout
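The coverage idea lends itself to a compact sketch (illustrative assumptions: topics represented as word-probability vectors over a shared vocabulary, cosine similarity as the matcher, and a fixed threshold; the paper's proposed measures may differ): a reference topic counts as covered when some model topic matches it closely enough.

```python
import numpy as np

def topic_coverage(model_topics, reference_topics, threshold=0.6):
    """Fraction of reference topics matched by at least one model topic.

    Topics are rows of word-probability vectors over a shared vocabulary;
    a reference topic is covered when its best-matching model topic
    exceeds the cosine-similarity threshold.
    """
    def normalize(t):
        return t / np.linalg.norm(t, axis=1, keepdims=True)
    sims = normalize(reference_topics) @ normalize(model_topics).T
    return np.mean(sims.max(axis=1) >= threshold)

rng = np.random.default_rng(0)
vocab = 1000
reference = rng.dirichlet(np.full(vocab, 0.01), size=20)  # 20 reference topics
# A model that recovers 12 reference topics plus 38 unrelated ones.
model = np.vstack([reference[:12] + 1e-4,
                   rng.dirichlet(np.full(vocab, 0.01), size=38)])
model /= model.sum(axis=1, keepdims=True)
print(topic_coverage(model, reference))  # ~0.6: 12 of 20 topics recovered
```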
Efficient Deep Learning for Real-time Classification of Astronomical Transients
A new golden age in astronomy is upon us, dominated by data. Large astronomical surveys are broadcasting unprecedented rates of information, demanding machine learning as a critical component in modern scientific pipelines to handle the deluge of data. The upcoming Legacy Survey of Space and Time (LSST) of the Vera C. Rubin Observatory will raise the big-data bar for time-domain astronomy, with an expected 10 million alerts per night and many petabytes of data generated over the lifetime of the survey. Fast and efficient classification algorithms that can operate in real time, yet robustly and accurately, are needed for time-critical events where additional resources can be sought for follow-up analyses. In order to handle such data, state-of-the-art deep learning architectures coupled with tools that leverage modern hardware accelerators are essential.
The work contained in this thesis seeks to address the big-data challenges of LSST by proposing novel, efficient deep learning architectures for multivariate time-series classification that can provide state-of-the-art classification of astronomical transients at a fraction of the computational cost of other deep learning approaches. This thesis introduces the depthwise-separable convolution and the notion of convolutional embeddings to the task of time-series classification, achieving gains in classification performance with far fewer model parameters than similar methods. It also introduces the attention mechanism to time-series classification, improving performance further still, with significant gains in computational efficiency and a further reduction in model size. Finally, this thesis pioneers the use of modern model compression techniques in the field of photometric classification for efficient deep learning deployment. These insights informed the final architecture, which was deployed in a live production machine learning system, demonstrating the capability to operate efficiently and robustly in real time, at LSST scale and beyond, ready for the new era of data-intensive astronomy.
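For readers unfamiliar with the core building block: a depthwise-separable convolution factorizes a standard convolution into a per-channel (depthwise) convolution followed by a pointwise 1x1 convolution, shrinking the parameter count roughly from C_in * C_out * k to C_in * k + C_in * C_out. Below is a minimal PyTorch sketch of such a block for multivariate time series; the channel sizes and kernel width are hypothetical, not the thesis architecture.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv1d(nn.Module):
    """Depthwise-separable 1D convolution for multivariate time series."""

    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        # Depthwise: one filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv1d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels)
        # Pointwise: 1x1 convolution mixing information across channels.
        self.pointwise = nn.Conv1d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# A light-curve-like batch: 8 objects, 6 passbands (channels), 128 time steps.
x = torch.randn(8, 6, 128)
block = DepthwiseSeparableConv1d(in_channels=6, out_channels=32, kernel_size=5)
print(block(x).shape)  # torch.Size([8, 32, 128])

# Parameter comparison against a standard convolution of the same shape.
standard = nn.Conv1d(6, 32, 5, padding=2)
print(sum(p.numel() for p in block.parameters()),     # 260
      sum(p.numel() for p in standard.parameters()))  # 992
```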