16,425 research outputs found

    Have Econometric Analyses of Happiness Data Been Futile? A Simple Truth About Happiness Scales

    Econometric analyses in the happiness literature typically use subjective well-being (SWB) data to compare the mean of observed or latent happiness across samples. Recent critiques show that comparing the means of ordinal data is valid only under strong assumptions that SWB data usually reject. This raises the open question of whether many of the empirical studies in the economics of happiness literature have been futile. To salvage some of the prior results and avoid future issues, we suggest that regression analysis of SWB (and other ordinal data) should focus on the median rather than the mean. Median comparisons using parametric models such as the ordered probit and logit can be readily carried out in familiar statistical software such as Stata. We also show that estimating a semiparametric median ordered-response model, a task previously assumed to be impractical, is possible using a novel constrained mixed integer optimization technique. We use GSS data to show that the famous Easterlin Paradox from the happiness literature holds for the US independent of any parametric assumption.

    A comprehensive literature classification of simulation optimisation methods

    Simulation Optimization (SO) provides a structured approach to system design and configuration when analytical expressions for input/output relationships are unavailable. Several excellent surveys have been written on this topic, but each concentrates on only a few classification criteria. This paper presents a literature survey that covers all classification criteria for SO techniques according to problem characteristics such as the shape of the response surface (global as compared to local optimization), objective functions (single or multiple objectives) and parameter spaces (discrete or continuous parameters). The survey focuses specifically on the SO problem that involves a single performance measure. Keywords: Simulation Optimization, classification methods, literature survey

    Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence

    Data in the form of pairwise comparisons arises in many domains, including preference elicitation, sporting competitions, and peer grading among others. We consider parametric ordinal models for such pairwise comparison data involving a latent vector $w^* \in \mathbb{R}^d$ that represents the "qualities" of the $d$ items being compared; this class of models includes the two most widely used parametric models--the Bradley-Terry-Luce (BTL) and the Thurstone models. Working within a standard minimax framework, we provide tight upper and lower bounds on the optimal error in estimating the quality score vector $w^*$ under this class of models. The bounds depend on the topology of the comparison graph induced by the subset of pairs being compared via its Laplacian spectrum. Thus, in settings where the subset of pairs may be chosen, our results provide principled guidelines for making this choice. Finally, we compare these error rates to those under cardinal measurement models and show that the error rates in the ordinal and cardinal settings have identical scalings apart from constant pre-factors. Comment: 39 pages, 5 figures. Significant extension of arXiv:1406.661

    Semi-parametric analysis of multi-rater data

    Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation, where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels, so a pre-processing step (normally assigning the majority class) is typically performed. We propose Bayesian models for classification and ordinal regression that naturally incorporate multiple expert opinions in defining predictive distributions. The models make use of Gaussian process priors, resulting in great flexibility and particular suitability to text-based problems where the number of covariates can be far greater than the number of data instances. We show that using all labels rather than just the majority improves performance on a recent biological dataset.

    Encrypted statistical machine learning: new privacy preserving methods

    We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and naïve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods. Comment: 39 page