93 research outputs found

    The Target-Based Utility Model. The role of Copulas and of Non-Additive Measures

    My studies and my Ph.D. thesis deal with topics that have recently emerged in the field of decisions under risk and uncertainty. In particular, I deal with the "target-based approach" to utility theory. A rich literature has been devoted over the last decade to this approach to economic decisions: originally, interest was focused on the "single-attribute" case and, more recently, extensions to the "multi-attribute" case have been studied. This literature is still growing, with a main focus on applied aspects. I will, on the contrary, focus attention on some theoretical aspects related to the multi-attribute case. Various mathematical concepts, such as non-additive measures, aggregation functions, multivariate probability distributions, and notions of stochastic dependence, emerge in the formulation and analysis of target-based models. Notions from the fields of non-additive measures and aggregation functions are quite common in the modern economic literature. They have been used to go beyond the classical principle of maximization of expected utility in decision theory. These notions are also used in game theory and multi-criteria decision aid. In my work, by contrast, I show how non-additive measures and aggregation functions emerge in a natural way within the target-based approach to classical utility theory when the multi-attribute case is considered. Furthermore, they combine with the analysis of multivariate probability distributions and with concepts of stochastic dependence. The concept of copula also constitutes a very important tool for this work, mainly for two purposes. The first is the analysis of target-based utilities; the other is the comparison between the classical stochastic order and the concept of "stochastic precedence". This topic finds applications in statistics as well as in the study of Markov models linked to waiting times for occurrences of words in random sampling of letters from an alphabet. In this work I give a generalization of the concept of stochastic precedence and discuss its properties on the basis of the properties of the connecting copulas of the variables. Throughout this work I also trace connections to reliability theory, whose aim is to study the lifetime of a system through the analysis of the lifetimes of its components. The target-based model finds an application in representing the behavior of the whole system by means of the interaction of its components.
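    For readers less familiar with these notions, the two central objects mentioned above can be sketched as follows (standard definitions from the target-based literature, stated here only for orientation; the multi-attribute formula is one common formulation, not necessarily the exact one developed in the thesis):

        % Single-attribute case: the utility of an outcome x is the probability
        % of meeting a random target T with distribution function F_T.
        u(x) = \mathbb{P}(T \le x) = F_T(x)

        % One common multi-attribute formulation: random targets T_1, ..., T_m and a
        % non-additive measure (capacity) v on the set of attributes; the utility of
        % x = (x_1, ..., x_m) averages v over the random set of attributes whose
        % targets are met, so the copula of (T_1, ..., T_m) enters explicitly.
        U(x_1, \dots, x_m) = \mathbb{E}\bigl[\, v(\{\, i : T_i \le x_i \,\}) \,\bigr]

        % Stochastic precedence: X stochastically precedes Y when
        \mathbb{P}(X \le Y) \ge \tfrac{1}{2}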

    TRIAGE: Characterizing and auditing training data for improved regression

    Data quality is crucial for robust machine learning algorithms, and the recent interest in data-centric AI has emphasized the importance of training-data characterization. However, current data characterization methods focus largely on classification settings, while regression settings remain understudied. To address this, we introduce TRIAGE, a novel data characterization framework tailored to regression tasks and compatible with a broad class of regressors. TRIAGE utilizes conformal predictive distributions to provide a model-agnostic scoring method, the TRIAGE score. We operationalize the score to analyze individual samples' training dynamics and to characterize samples as under-, over-, or well-estimated by the model. We show that TRIAGE's characterization is consistent and highlight its utility for improving performance via data sculpting/filtering in multiple regression settings. Additionally, beyond the sample level, we show that TRIAGE enables new approaches to dataset selection and feature acquisition. Overall, TRIAGE highlights the value unlocked by data characterization in real-world regression applications. Comment: Presented at NeurIPS 202
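    As a rough illustration of the scoring idea, here is a minimal split-conformal sketch, not the authors' implementation: TRIAGE additionally tracks such scores over training checkpoints, and the regressor, helper name and thresholds below are assumptions of the example.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.model_selection import train_test_split

        def cpd_scores(model, X_cal, y_cal, X_eval, y_eval):
            """Evaluate a split-conformal predictive distribution at each true label.

            Values near 0.5 mean the label sits in the middle of the predictive
            distribution (well-estimated); values near 1 / 0 suggest the model
            under- / over-estimates that sample.
            """
            residuals = y_cal - model.predict(X_cal)        # calibration residuals
            preds = model.predict(X_eval)
            grid = preds[:, None] + residuals[None, :]      # shape (n_eval, n_cal)
            return (grid <= y_eval[:, None]).mean(axis=1)   # CPD evaluated at y_eval

        # toy usage with a hypothetical regressor
        rng = np.random.default_rng(0)
        X = rng.normal(size=(400, 5))
        y = X @ rng.normal(size=5) + 0.3 * rng.normal(size=400)
        X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)
        model = Ridge().fit(X_tr, y_tr)
        scores = cpd_scores(model, X_cal, y_cal, X_tr, y_tr)
        well_estimated = (scores > 0.25) & (scores < 0.75)  # illustrative thresholds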

    Marginal equivalence in v-spherical models

    Bayesian Statistics; Models

    Colouring and breaking sticks: random distributions and heterogeneous clustering

    We begin by reviewing some probabilistic results about the Dirichlet Process and its close relatives, focussing on their implications for statistical modelling and analysis. We then introduce a class of simple mixture models in which clusters are of different 'colours', with statistical characteristics that are constant within colours, but different between colours. Thus cluster identities are exchangeable only within colours. The basic form of our model is a variant on the familiar Dirichlet process, and we find that much of the standard modelling and computational machinery associated with the Dirichlet process may be readily adapted to our generalisation. The methodology is illustrated with an application to the partially-parametric clustering of gene expression profiles. Comment: 26 pages, 3 figures. Chapter 13 of "Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman" (Editors N.H. Bingham and C.M. Goldie), Cambridge University Press, 201
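    As background for the "breaking sticks" in the title, here is a minimal sketch of the standard stick-breaking construction of a Dirichlet process draw; the colour-dependent generalisation of the paper is not reproduced, and the concentration parameter and truncation level are arbitrary choices for the example.

        import numpy as np

        def stick_breaking_weights(alpha, n_atoms, rng):
            """Truncated stick-breaking: w_k = v_k * prod_{j<k} (1 - v_j), v_k ~ Beta(1, alpha)."""
            v = rng.beta(1.0, alpha, size=n_atoms)
            remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
            return v * remaining

        rng = np.random.default_rng(1)
        weights = stick_breaking_weights(alpha=2.0, n_atoms=50, rng=rng)
        atoms = rng.normal(size=50)        # draws from the base measure G0 = N(0, 1)
        # a truncated draw from DP(alpha, G0) is the discrete measure sum_k weights[k] * delta(atoms[k]);
        # sampling from it clusters observations around a random subset of the atoms
        samples = rng.choice(atoms, size=1000, p=weights / weights.sum())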

    A survey of random processes with reinforcement

    The models surveyed include generalized Pólya urns, reinforced random walks, interacting urn models, and continuous reinforced processes. Emphasis is on methods and results, with sketches provided of some proofs. Applications are discussed in statistics, biology, economics and a number of other areas. Comment: Published at http://dx.doi.org/10.1214/07-PS094 in the Probability Surveys (http://www.i-journals.org/ps/) by the Institute of Mathematical Statistics (http://www.imstat.org)
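    For concreteness, a minimal simulation of the classical two-colour Pólya urn, the simplest member of the family surveyed; the initial composition and reinforcement amount below are arbitrary choices for the example.

        import numpy as np

        def polya_urn(n_draws, init=(1, 1), reinforcement=1, rng=None):
            """Two-colour Pólya urn: draw a ball with probability proportional to the
            current counts, then return it with `reinforcement` extra balls of the
            same colour.  Returns the trajectory of the fraction of colour 0."""
            rng = rng or np.random.default_rng()
            counts = np.array(init, dtype=float)
            fractions = []
            for _ in range(n_draws):
                colour = rng.choice(2, p=counts / counts.sum())
                counts[colour] += reinforcement
                fractions.append(counts[0] / counts.sum())
            return np.array(fractions)

        # the fraction of colour 0 converges almost surely (here to a uniform, i.e. Beta(1, 1), limit)
        trajectory = polya_urn(10_000)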

    Bayesian nonparametric analysis of reversible Markov chains

    We introduce a three-parameter random walk with reinforcement, called the (θ, α, β) scheme, which generalizes the linearly edge-reinforced random walk to uncountable spaces. The parameter β smoothly tunes the (θ, α, β) scheme between this edge-reinforced random walk and the classical exchangeable two-parameter Hoppe urn scheme, while the parameters α and θ modulate how many states are typically visited. Resorting to de Finetti's theorem for Markov chains, we use the (θ, α, β) scheme to define a nonparametric prior for Bayesian analysis of reversible Markov chains. The prior is applied in Bayesian nonparametric inference for species sampling problems with data generated from a reversible Markov chain with an unknown transition kernel. As a real example, we analyze data from molecular dynamics simulations of protein folding. Comment: Published at http://dx.doi.org/10.1214/13-AOS1102 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
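    For orientation, a minimal sketch of the exchangeable boundary case mentioned above, a two-parameter Hoppe-type urn; this is only that boundary case, not the full (θ, α, β) scheme, and the sequential rule below is the standard two-parameter seating rule, assumed here purely for illustration.

        import numpy as np

        def two_parameter_hoppe_urn(n_steps, theta=1.0, alpha=0.5, rng=None):
            """At step n+1, visit a new state with probability (theta + alpha*K)/(theta + n),
            where K is the number of distinct states seen so far; otherwise revisit state j
            with probability (n_j - alpha)/(theta + n).  Returns the visited state labels."""
            rng = rng or np.random.default_rng()
            counts = []                                     # n_j: visits to state j
            sequence = []
            for n in range(n_steps):
                K = len(counts)
                probs = np.array([c - alpha for c in counts] + [theta + alpha * K])
                j = rng.choice(K + 1, p=probs / (theta + n))
                if j == K:
                    counts.append(1.0)                      # create a new state
                else:
                    counts[j] += 1.0
                sequence.append(j)
            return sequence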

    Uncertainty Estimation, Explanation and Reduction with Insufficient Data

    Human beings constantly make decisions under uncertainty, trading off swift action against collecting sufficient evidence. It is natural to expect a generalized artificial intelligence (GAI) to navigate uncertainty while still predicting precisely. In this thesis, we propose strategies that underpin machine learning under uncertainty from three perspectives: uncertainty estimation, explanation and reduction. Estimation quantifies the variability in the model inputs and outputs and lets us evaluate the model's predictive confidence. Explanation provides a tool to interpret the mechanism behind uncertainties and to pinpoint opportunities for uncertainty reduction, which focuses on stabilizing model training, especially when data is insufficient. We hope that this thesis can motivate related studies on quantifying predictive uncertainties in deep learning. It also aims to raise awareness among other stakeholders in the fields of smart transportation and automated medical diagnosis, where data insufficiency induces high uncertainty. The thesis is organized into the following sections. Introduction: we justify the necessity of investigating AI uncertainties and clarify the challenges in the latest studies, followed by our research objective. Literature review: we break down the review of state-of-the-art methods into uncertainty estimation, explanation and reduction, and make comparisons with related fields including meta learning, anomaly detection and continual learning. Uncertainty estimation: we introduce a variational framework, the neural process, which approximates Gaussian processes to handle uncertainty estimation; two variants from the neural process family are proposed to endow neural processes with scalability and continual learning. Uncertainty explanation: we inspect the functional distribution of neural processes to discover the global and local factors that affect the degree of predictive uncertainty. Uncertainty reduction: we validate the proposed uncertainty framework in two scenarios, urban irregular behaviour detection and neurological disorder diagnosis, where intrinsic data insufficiency undermines the performance of existing deep learning models. Conclusion: we provide promising directions for future work and conclude the thesis.
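    To make the "neural process" framework mentioned above concrete, a minimal sketch of a vanilla conditional neural process forward pass; the layer sizes, the mean aggregator and the Gaussian head are assumptions of this example, not the specific variants developed in the thesis.

        import torch
        import torch.nn as nn

        class ConditionalNeuralProcess(nn.Module):
            """Encode (x, y) context pairs, average them into one representation,
            then decode (representation, x_target) into a Gaussian mean and std."""

            def __init__(self, x_dim=1, y_dim=1, r_dim=64):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, r_dim))
                self.decoder = nn.Sequential(
                    nn.Linear(r_dim + x_dim, r_dim), nn.ReLU(), nn.Linear(r_dim, 2 * y_dim))

            def forward(self, x_ctx, y_ctx, x_tgt):
                r = self.encoder(torch.cat([x_ctx, y_ctx], dim=-1)).mean(dim=0)  # aggregate context
                r = r.expand(x_tgt.shape[0], -1)
                mu, raw = self.decoder(torch.cat([r, x_tgt], dim=-1)).chunk(2, dim=-1)
                sigma = 0.1 + 0.9 * nn.functional.softplus(raw)                  # predictive std
                return mu, sigma

        # toy usage (untrained; in practice the network is trained by maximising the Gaussian
        # log-likelihood of held-out targets, after which sigma quantifies predictive uncertainty)
        model = ConditionalNeuralProcess()
        x_ctx = torch.linspace(-1, 1, 10).unsqueeze(-1)
        y_ctx = torch.sin(3 * x_ctx)
        x_tgt = torch.linspace(-2, 2, 50).unsqueeze(-1)
        mu, sigma = model(x_ctx, y_ctx, x_tgt)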