
    Predicting diabetes-related hospitalizations based on electronic health records

    OBJECTIVE: To derive a predictive model that identifies patients likely to be hospitalized during the following year due to complications attributed to Type II diabetes. METHODS: A variety of supervised machine learning classification methods were tested, and a new method was developed that discovers hidden patient clusters in the positive class (hospitalized) while simultaneously deriving sparse linear support vector machine classifiers that separate the positive samples from the negative ones (non-hospitalized). The convergence of the new method was established, and theoretical guarantees were proved on how the classifiers it produces generalize to a test set not seen during training. RESULTS: The methods were tested on a large set of patients from the Boston Medical Center, the largest safety net hospital in New England. The new joint clustering/classification method achieves an accuracy of 89% (measured in terms of area under the ROC curve) and yields informative clusters that help interpret the classification results, thus increasing physicians' trust in the algorithmic output and providing some guidance towards preventive measures. While other methods can increase accuracy to 92%, this comes with increased computational cost and a lack of interpretability. The analysis shows that even a modest probability of preventive actions being effective (more than 19%) suffices to generate significant hospital care savings. CONCLUSIONS: Predictive models are proposed that can help avert hospitalizations, improve health outcomes and drastically reduce hospital expenditures. The scope for savings is significant, as it has been estimated that in the USA alone about $5.8 billion is spent each year on diabetes-related hospitalizations that could be prevented. Accepted manuscript
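
    The abstract reports accuracy as area under the ROC curve. As a minimal sketch of what that number measures, the function below computes it as the fraction of (hospitalized, non-hospitalized) pairs that a classifier ranks correctly. The function name `roc_auc` and the 0/1 label encoding are assumptions made for illustration; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pair-counting statistic.

    labels: sequence of 0/1 values (1 = positive class, assumed encoding)
    scores: classifier scores, where a higher score means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A pair is ranked correctly when the positive sample scores higher;
    # tied pairs count as one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

    Under this reading, a value of 0.89 would mean that a randomly chosen hospitalized patient receives a higher score than a randomly chosen non-hospitalized patient about 89% of the time.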

    The DD^G-classifier in the functional setting

    The Maximum Depth classifier was the first attempt to use data depths instead of multivariate raw data to construct a classification rule. Recently, the DD-classifier has solved several serious limitations of the Maximum Depth classifier, but some issues still remain. This paper is devoted to extending the DD-classifier in the following ways: first, to overcome the limitation of the DD-classifier when more than two groups are involved; second, to apply regular classification methods (like kNN, linear or quadratic classifiers, recursive partitioning, ...) to DD-plots to obtain useful insights through the diagnostics of these methods; and third, to integrate different sources of information (data depths or multivariate functional data) in a unified way in the classification procedure. Besides, as the DD-classifier trick is especially useful in the functional framework, an enhanced revision of several functional data depths is done in the paper. A simulation study and applications to some classical real datasets are also provided, showing the power of the new proposal. Comment: 29 pages, 6 figures, 6 tables, Supplemental R Code and Data
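
    The idea of classifying in depth space with more than two groups can be sketched as follows: each observation is mapped to its vector of depths with respect to every group, and an ordinary classifier (here kNN) is run on those depth coordinates. This is only an illustrative sketch under strong assumptions: it uses the Mahalanobis depth on 2-D points rather than the functional depths reviewed in the paper, and the names `mahalanobis_depth` and `fit_dd_knn` are hypothetical, not taken from the paper or its R code.

```python
import math
from collections import Counter


def mahalanobis_depth(sample):
    """Return a function x -> Mahalanobis depth of x w.r.t. a 2-D sample."""
    n = len(sample)
    mx = sum(p[0] for p in sample) / n
    my = sum(p[1] for p in sample) / n
    # Unbiased sample covariance entries.
    sxx = sum((p[0] - mx) ** 2 for p in sample) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in sample) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / (n - 1)
    det = sxx * syy - sxy ** 2  # assumed non-zero (non-degenerate sample)

    def depth(x):
        dx, dy = x[0] - mx, x[1] - my
        # Squared Mahalanobis distance using the explicit 2x2 inverse.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        return 1.0 / (1.0 + d2)

    return depth


def fit_dd_knn(groups, k=3):
    """groups: dict mapping label -> list of 2-D points. Returns predict(x)."""
    labels = sorted(groups)
    depths = [mahalanobis_depth(groups[g]) for g in labels]

    def to_depth_space(x):
        # One depth coordinate per group: the G-dimensional analogue of a DD-plot.
        return tuple(d(x) for d in depths)

    train = [(to_depth_space(p), g) for g in labels for p in groups[g]]

    def predict(x):
        z = to_depth_space(x)
        nearest = sorted(train, key=lambda t: math.dist(z, t[0]))[:k]
        # Majority vote among the k nearest neighbours in depth space.
        return Counter(g for _, g in nearest).most_common(1)[0][0]

    return predict
```

    Because the classifier only ever sees depth coordinates, swapping in a different depth function or a different final classifier changes a single line, which is the flexibility the abstract describes.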

    'Part'ly first among equals: Semantic part-based benchmarking for state-of-the-art object recognition systems

    An examination of object recognition challenge leaderboards (ILSVRC, PASCAL-VOC) reveals that the top-performing classifiers typically exhibit small differences amongst themselves in terms of error rate/mAP. To better differentiate the top performers, additional criteria are required. Moreover, the (test) images on which the performance scores are based predominantly contain fully visible objects. Therefore, 'harder' test images, mimicking the challenging conditions (e.g. occlusion) in which humans routinely recognize objects, need to be utilized for benchmarking. To address these concerns, we make two contributions. First, we systematically vary the level of local object-part content, global detail and spatial context in images from PASCAL VOC 2010 to create a new benchmarking dataset dubbed PPSS-12. Second, we propose an object-part based benchmarking procedure which quantifies classifiers' robustness to a range of visibility and contextual settings. The benchmarking procedure relies on a semantic similarity measure that naturally addresses potential semantic granularity differences between the category labels in training and test datasets, thus eliminating manual mapping. We use our procedure on the PPSS-12 dataset to benchmark top-performing classifiers trained on the ILSVRC-2012 dataset. Our results show that the proposed benchmarking procedure enables additional differentiation among state-of-the-art object classifiers in terms of their ability to handle missing content and insufficient object detail. Given this capability for additional differentiation, our approach can potentially supplement existing benchmarking procedures used in object recognition challenge leaderboards. Comment: Extended version of our ACCV-2016 paper. Author formatting modified