16 research outputs found

    Bayes-Optimal Scorers for Bipartite Ranking

    Bipartite Ranking: a Risk-Theoretic Perspective

    We present a systematic study of the bipartite ranking problem, with the aim of explicating its connections to the class-probability estimation problem. Our study focuses on the properties of the statistical risk for bipartite ranking with general losses, which is closely related to a generalised notion of the area under the ROC curve: we establish alternate representations of this risk, relate the Bayes-optimal risk to a class of probability divergences, and characterise the set of Bayes-optimal scorers for the risk. We further study properties of a generalised class of bipartite risks, based on the p-norm push of Rudin (2009). Our analysis is based on the rich framework of proper losses, which are the central tool in the study of class-probability estimation. We show how this analytic tool makes transparent the generalisations of several existing results, such as the equivalence of the minimisers for four seemingly disparate risks from bipartite ranking and class-probability estimation. A novel practical implication of our analysis is the design of new families of losses for scenarios where accuracy at the head of the ranked list is paramount, with comparable empirical performance to the p-norm push.
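As a rough illustration of the link between the bipartite ranking risk and the area under the ROC curve mentioned in this abstract, the sketch below computes the empirical AUC as the fraction of positive-negative pairs that a scorer orders correctly, counting ties as half. The function names and the toy scores are hypothetical, and the 0-1 pairwise loss used here is only one special case of the general losses the paper studies.

```python
from itertools import product


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    total = 0.0
    for s_pos, s_neg in product(pos_scores, neg_scores):
        if s_pos > s_neg:
            total += 1.0
        elif s_pos == s_neg:
            total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def pairwise_ranking_risk(pos_scores, neg_scores):
    """Empirical bipartite ranking risk under the 0-1 pairwise loss (1 - AUC)."""
    return 1.0 - empirical_auc(pos_scores, neg_scores)


# Hypothetical scores, for illustration only.
positives = [0.9, 0.7, 0.4]
negatives = [0.8, 0.4, 0.1]
auc = empirical_auc(positives, negatives)
risk = pairwise_ranking_risk(positives, negatives)
```

Enumerating all pairs costs O(mn) time for m positives and n negatives, which is fine for a sketch; a sort-based computation would be the usual choice for large samples.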

    In Defense of Softmax Parametrization for Calibrated and Consistent Learning to Defer

    Enabling machine learning classifiers to defer their decision to a downstream expert when the expert is more accurate will ensure improved safety and performance. This objective can be achieved with the learning-to-defer framework, which aims to jointly learn how to classify and how to defer to the expert. Recent studies have shown theoretically that popular estimators for learning to defer parameterized with softmax provide unbounded estimates for the likelihood of deferring, which makes them uncalibrated. However, it remains unknown whether this is due to the widely used softmax parameterization and whether we can find a softmax-based estimator that is both statistically consistent and possesses a valid probability estimator. In this work, we first show that the miscalibrated and unbounded estimator in prior literature is caused by the symmetric nature of the surrogate losses used and not by softmax. We then propose a novel statistically consistent asymmetric softmax-based surrogate loss that can produce valid estimates without the issue of unboundedness. We further analyze the non-asymptotic properties of our method and empirically validate its performance and calibration on benchmark datasets. Comment: NeurIPS 202

    Using Machine Learning Techniques to Predict a Risk Score for New Members of a Chit Fund Group

    Risk score prediction for new and potential customers is used across the financial industry. By predicting risk scores for its customers, a chit fund company can improve its understanding of customers without relying on human knowledge. Data is collected on each customer before they have taken out credit and during the time they contribute to a chit fund. Having collected the necessary data, the company can then decide whether modelling customer risk would benefit them. As the data is available historically, one aspect of risk score prediction will be the focus of this thesis: supervised machine learning. Supervised machine learning techniques use historic data to ‘learn a model of the relationship between a set of descriptive features and a target feature’ (Kelleher, Mac Namee, & D’Arcy, 2015). There are many supervised machine learning techniques; support vector machines (SVM), logistic regression and decision trees will be the focal point of this thesis. The main objective of this project is to predict a risk score for new or potential subscribers of a chit fund company. The models generated would be suitable for use before a customer joins a chit fund group as well as while the customer is taking part in the group, measuring risk before becoming a subscriber and the behavioural risk while with the company. The objective is to extend research already carried out to predict a score from zero to one identifying the probability of default. Default, for the purpose of this project, is defined as being more than 90 days late with a payment. The data of real chit fund subscribers was used to train and test the models built for the project. A factor reduction technique was used to identify key variables, and multiple models were tested to determine which gives the best results. The second objective of this project is to examine the subscriber network. This section of the project will check for links between subscribers, and investigate a possible link between subscribers and their chance of default. Variables such as address and nominee will be the focus in this section. The most successful supervised machine learning model was the random forest model, with a precision of 59% and a recall of 92%. Accuracy for this model, at 85%, was the highest of all the models in the experiment. However, this is not the most trustworthy evaluation measure for this project as the dataset is unbalanced. A combination of 300 decision trees was applied in this model. Using the classification method, the class that was predicted by the majority of trees was selected as the final prediction. This achieved high accuracy on the dataset from the chit fund company, Kyepot. Social network analysis found that there was no unusual relationship between subscribers that went into default with regards to the area in which they live or their nominees. Supervised machine learning techniques have been shown to be a useful tool in the financial industry. This project suggests that these techniques may also be useful tools for chit fund companies. This project evaluates four different techniques, suggesting the random forest technique is the most useful for this chit fund company.
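Two points in this abstract lend themselves to a small sketch: the majority vote over decision trees, and the remark that accuracy is not trustworthy on an unbalanced dataset. The code below is a minimal illustration under assumed inputs, not the thesis's implementation; the function names and the confusion-matrix counts are hypothetical and are not taken from the Kyepot data.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall and accuracy from confusion-matrix counts.

    Precision and recall fall back to 0.0 when their denominator is zero.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Hypothetical unbalanced data: 10 defaulters among 100 subscribers.
# A model that never predicts default still reaches 90% accuracy
# while identifying no defaulters at all (recall 0).
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=90)

# Hypothetical votes from five trees for a single subscriber.
vote = majority_vote([1, 0, 1, 1, 0])
```

This is why precision and recall are reported alongside accuracy when one class is rare.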

    Learning from Corrupted Binary Labels via Class-Probability Estimation

    Many supervised learning problems involve learning from samples whose labels are corrupted in some way. For example, each label may be flipped with some constant probability (learning with label noise), or one may have a pool of unlabelled samples in lieu of negative samples (learning from positive and unlabelled data). This paper uses class-probability estimation to study these and other corruption processes belonging to the mutually contaminated distributions framework. Learning from corrupted binary labels. In many practical scenarios involving learning from binary labels, one observes samples whose labels are corrupted versions of the actual ground truth. For example, in learning from class-conditional label noise (CCN learning), the labels are flipped with some constant probability. A fundamental question is whether one can minimise a given performance measure with respect to D, given access only to samples from D_corr. Intuitively, in general this requires knowledge of the parameters of the corruption process that determines D_corr. This yields two further questions: are there measures for which knowledge of these corruption parameters is unnecessary, and for other measures, can we estimate these parameters? In this paper, we consider corruption problems belonging to the mutually contaminated distributions framework. While some of our results are known for the special cases of CCN and PU learning, our interest is in determining to what extent they generalise to other label corruption problems. This is a step towards a unified treatment of these problems. We now fix notation and formalise the problem.
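The class-conditional label noise (CCN) setting described in this abstract can be made concrete with a small sketch. Assuming a positive label is flipped with probability `rho_pos` and a negative label with probability `rho_neg` (both names are hypothetical, and the rates are assumed known with `rho_pos + rho_neg < 1`), the corrupted class-probability is an affine function of the clean one and can therefore be inverted. This is a standard identity for CCN noise, not a reproduction of the paper's analysis of the wider mutually contaminated distributions framework.

```python
def corrupted_class_probability(eta, rho_pos, rho_neg):
    """P(corrupted label = 1 | x), given the clean P(label = 1 | x) = eta.

    A clean positive keeps its label with probability 1 - rho_pos;
    a clean negative is flipped to positive with probability rho_neg.
    """
    return (1.0 - rho_pos) * eta + rho_neg * (1.0 - eta)


def clean_class_probability(eta_corr, rho_pos, rho_neg):
    """Invert the affine map above; requires rho_pos + rho_neg < 1."""
    scale = 1.0 - rho_pos - rho_neg
    if scale <= 0.0:
        raise ValueError("noise rates must satisfy rho_pos + rho_neg < 1")
    return (eta_corr - rho_neg) / scale


# Hypothetical values, for illustration only.
eta_corr = corrupted_class_probability(0.8, rho_pos=0.2, rho_neg=0.1)
eta_back = clean_class_probability(eta_corr, rho_pos=0.2, rho_neg=0.1)
```

Because the map is strictly increasing whenever `rho_pos + rho_neg < 1`, ordering instances by the corrupted class-probability gives the same ordering as the clean one, whereas recovering the clean probability itself requires the noise rates.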

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

    The Dynamics of Social Hierarchy

    A growing body of research has outlined that humans gain social rank through two pathways: prestige and dominance. This dual model of social hierarchy proposes that individuals attain positions of high rank through signals of an ability and willingness to either inflict harm on group members (dominance) or confer benefits to them (prestige). While there is growing support for the dual model of social hierarchy, the extant empirical evidence has been cross-sectional and has neglected the impact that time and context have on the efficacy of prestige and dominance as long-term processes. The present research outlines a theoretical framework for the trajectories of prestige, dominance and social rank over time, and further provides longitudinal evidence of their temporal dynamics. In addition, the current research tests the longitudinal associations that prestige and dominance have with social networks. Results of Study 1 suggest that, in collaborative task groups, prestige has a positive and bidirectional temporal association with social rank, while the association between dominance and social rank diminished over time. Study 2 indicated that in these task groups those high in prestige were more likely to be asked for advice and prestige was transmitted through advice ties but had a limited association with friendship. Those high in dominance were less likely to be nominated as friends, but dominance was transmitted through friendship ties. Results from Study 3 suggest that those high in prestige status were more likely to aid in food sharing and food production, and that the prestige status of an individual’s food sharing and food production partners increased their prestige status over a period of twelve years among the Tsimane forager-horticulturalists of Bolivia. Overall, the present research highlights the distinction between prestige and dominance over time and shows that prestige, dominance, social rank and social networks have bidirectional, dynamic relationships over time.

    Proceedings of the 2021 Symposium on Information Theory and Signal Processing in the Benelux, May 20-21, TU Eindhoven
