1,044 research outputs found

    Information-Theoretic Measures for Objective Evaluation of Classifications

    This work presents a systematic study of objective evaluations of abstaining classifications using Information-Theoretic Measures (ITMs). First, we define objective measures as those that do not depend on any free parameter. This definition provides technical simplicity for examining "objectivity" or "subjectivity" directly in classification evaluations. Second, we propose twenty-four normalized ITMs, derived from mutual information, divergence, or cross-entropy, for investigation. In contrast to conventional performance measures, which apply empirical formulas based on users' intuitions or preferences, the ITMs are theoretically more sound for realizing objective evaluations of classifications. We apply them to distinguish "error types" and "reject types" in binary classifications without requiring cost terms as input data. Third, to better understand and select the ITMs, we suggest three desirable features for classification assessment measures, which appear more crucial and appealing from the viewpoint of classification applications. Using these features as "meta-measures", we can reveal the advantages and limitations of ITMs from a higher level of evaluation knowledge. Numerical examples are given to corroborate our claims and compare the differences among the proposed measures. The best measure is selected in terms of the meta-measures, and its specific properties regarding error types and reject types are analytically derived.
    Comment: 25 pages, 1 figure, 10 tables
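
    As a rough illustration of a normalized, parameter-free measure derived from mutual information, the sketch below computes I(T;Y) / H(T) from a confusion matrix whose last column holds rejected (abstained) samples. The function name, the matrix layout, and the choice of H(T) as the normalizer are assumptions made for this example; the abstract does not specify the twenty-four measures, so this should not be read as any one of them.

```python
import math


def normalized_mutual_information(confusion):
    """Return I(T;Y) / H(T) for a confusion matrix of counts.

    Rows are true classes; columns are decisions. A final "reject" column
    is allowed (hypothetical layout chosen for this sketch).
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Marginal distributions of the true class T and the decision Y.
    p_true = [sum(row) / total for row in confusion]
    p_pred = [sum(col) / total for col in zip(*confusion)]

    # Mutual information in bits; empty cells contribute nothing.
    mutual_info = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count > 0:
                p = count / total
                mutual_info += p * math.log2(p / (p_true[i] * p_pred[j]))

    # Entropy of the true-class distribution, used as the normalizer.
    entropy_true = -sum(p * math.log2(p) for p in p_true if p > 0)
    if entropy_true == 0:
        raise ValueError("only one true class is present")
    return mutual_info / entropy_true


# Two classifiers with the same error count (one mistake out of 100 samples)
# but different error types on imbalanced classes, with no rejections.
majority_error = [[89, 1, 0], [0, 10, 0]]
minority_error = [[90, 0, 0], [1, 9, 0]]
print(normalized_mutual_information(majority_error))
print(normalized_mutual_information(minority_error))
```

    No cost terms or other free parameters are supplied, yet the two matrices receive different scores, which is the kind of error-type distinction the abstract describes.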

    An analysis of the user occupational class through Twitter content

    Social media content can be used as a complementary source to the traditional methods for extracting and studying collective social attributes. This study focuses on the prediction of the occupational class for a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user's occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications.

    Decision Analysis with Geographically Varying Outcomes: Preference Models and Illustrative Applications

    DRMI Working Paper Series. The series is intended to convey the preliminary results of [DRMI] ongoing research. The research described in these papers is preliminary and has not completed the usual review process for Institute publications. We welcome feedback from readers and encourage you to convey your comments and criticisms directly to the authors.

    Detecting Premature Ventricular Contraction by using Regulated Discriminant Analysis with very sparse training data

    Pathological electrocardiograms are often used to diagnose cardiac disorders, where accurate classification of cardiac beat types is crucial for timely diagnosis of dangerous conditions. However, accurate, timely, and precise detection of arrhythmia types such as premature ventricular contraction is very challenging because these signals are multiform; that is, reliable detection requires expert annotations. In this paper, a multivariate statistical classifier that is able to detect premature ventricular contraction beats is presented. This novel classifier can be trained with a very small amount of expert-annotated data. To enable this, the dimensionality of the feature vector is kept very low, and the classifier uses strongly designed features and a regularization mechanism. This approach is compared to other classifiers using the MIT-BIH arrhythmia database. The average accuracy, specificity, and sensitivity are found to be above 96%, which is superior given the small amount of training data.

    Community Detection in Weighted Multilayer Networks with Ambient Noise

    We introduce a novel class of stochastic blockmodel for multilayer weighted networks that accounts for the presence of a global ambient noise that governs between-block interactions. We induce a hierarchy of classifications in weighted multilayer networks by assuming that all but one cluster (block) are governed by unique local signals, while a single block is classified as ambient noise, which behaves identically to interactions across differing blocks. Hierarchical variational inference is employed to jointly detect and typologize block-structures as local signals or global noise. These principles are incorporated into a novel community detection algorithm called Stochastic Block (with) Ambient Noise Model (SBANM) for multilayer weighted networks. We apply this method to several different domains. We focus on the Philadelphia Neurodevelopmental Cohort to discover communities of subjects that form diagnostic categories relating psychopathological symptoms to psychosis.
    Comment: 27 pages
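
    The block structure described above can be pictured with a small generative sketch: within-block edge weights in a signal block follow that block's own distribution, while edges inside the ambient block and edges between different blocks share one noise distribution. The single layer, the Gaussian weights, and every name below are assumptions made for illustration; this is not the SBANM inference algorithm, only a toy sampler for the assumed structure.

```python
import random


def sample_weighted_network(block_of, signal_params, ambient_params,
                            ambient_block, seed=0):
    """Sample a symmetric weight matrix for one layer (toy sketch).

    block_of       -- list mapping node index to a block label
    signal_params  -- {block label: (mean, sd)} for the signal blocks
    ambient_params -- (mean, sd) shared by the ambient block and by all
                      between-block edges
    ambient_block  -- label of the block treated as ambient noise
    """
    rng = random.Random(seed)
    n = len(block_of)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same_block = block_of[i] == block_of[j]
            if same_block and block_of[i] != ambient_block:
                # Local signal: parameters specific to this block.
                mean, sd = signal_params[block_of[i]]
            else:
                # Global ambient noise: between-block or ambient-block edge.
                mean, sd = ambient_params
            weights[i][j] = weights[j][i] = rng.gauss(mean, sd)
    return weights


# Six nodes: blocks 0 and 1 carry signals, block 2 is ambient noise.
labels = [0, 0, 1, 1, 2, 2]
network = sample_weighted_network(
    labels,
    signal_params={0: (5.0, 1.0), 1: (-3.0, 1.0)},
    ambient_params=(0.0, 1.0),
    ambient_block=2,
)
for row in network:
    print(["%.2f" % w for w in row])
```

    A multilayer version would repeat the draw once per layer with layer-specific parameters; that extension is left out to keep the sketch short.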