    Supersparse Linear Integer Models for Optimized Medical Scoring Systems

    Scoring systems are linear classification models that only require users to add, subtract and multiply a few small numbers in order to make a prediction. These models are in widespread use by the medical community, but are difficult to learn from data because they need to be accurate and sparse, have coprime integer coefficients, and satisfy multiple operational constraints. We present a new method for creating data-driven scoring systems called a Supersparse Linear Integer Model (SLIM). SLIM scoring systems are built by solving an integer program that directly encodes measures of accuracy (the 0-1 loss) and sparsity (the $\ell_0$-seminorm) while restricting coefficients to coprime integers. SLIM can seamlessly incorporate a wide range of operational constraints related to accuracy and sparsity, and can produce highly tailored models without parameter tuning. We provide bounds on the testing and training accuracy of SLIM scoring systems, and present a new data reduction technique that can improve scalability by eliminating a portion of the training data beforehand. Our paper includes results from a collaboration with the Massachusetts General Hospital Sleep Laboratory, where SLIM was used to create a highly tailored scoring system for sleep apnea screening. Comment: This version reflects our findings on SLIM as of January 2016 (arXiv:1306.5860 and arXiv:1405.4047 are out-of-date). The final published version of this article is available at http://www.springerlink.co
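
As a rough illustration of what such a scoring system computes, the sketch below sums integer points and thresholds the total. The feature names, point values, and intercept are hypothetical and are not taken from the paper. It only evaluates a fixed model; it does not solve the integer program that SLIM uses to learn the coefficients.

```python
from functools import reduce
from math import gcd
from typing import Dict, List

# Hypothetical integer points per binary feature (illustrative only).
POINTS: Dict[str, int] = {"x1": 3, "x2": -2, "x3": 1, "x4": 0}
INTERCEPT = -2  # hypothetical offset added to every score


def score(sample: Dict[str, int]) -> int:
    """Add up the points of the features present in the sample."""
    return INTERCEPT + sum(pts * sample.get(name, 0) for name, pts in POINTS.items())


def predict(sample: Dict[str, int]) -> int:
    """Predict +1 when the total score is positive, otherwise -1."""
    return 1 if score(sample) > 0 else -1


def zero_one_loss(samples: List[Dict[str, int]], labels: List[int]) -> float:
    """Fraction of samples whose prediction differs from the label."""
    errors = sum(predict(s) != y for s, y in zip(samples, labels))
    return errors / len(labels)


def l0_seminorm(points: Dict[str, int]) -> int:
    """Number of non-zero coefficients, i.e. the model size."""
    return sum(1 for pts in points.values() if pts != 0)


def is_coprime(points: Dict[str, int]) -> bool:
    """True when the non-zero coefficients share no common factor above 1."""
    nonzero = [abs(pts) for pts in points.values() if pts != 0]
    return reduce(gcd, nonzero) == 1
```

With these made-up values, a sample with only `x1` present scores 1 and is classified as positive, and the model uses three of its four candidate features.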

    Data fusion techniques for biomedical informatics and clinical decision support

    Data fusion can be used to combine multiple data sources or modalities to facilitate enhanced visualization, analysis, detection, estimation, or classification. Data fusion can be applied at the raw-data, feature-based, and decision-based levels. Data fusion applications of various kinds have been developed in areas such as statistics, computer vision, and other branches of machine learning. It has been employed in a variety of realistic scenarios such as medical diagnosis, clinical decision support, and structural health monitoring. This dissertation includes the investigation and development of methods to perform data fusion for cervical cancer intraepithelial neoplasia (CIN) and a clinical decision support system. The general framework for these applications includes image processing followed by feature development and classification of the detected region of interest (ROI). Image processing methods such as k-means clustering based on color information, dilation, erosion, and centroid locating methods were used for ROI detection. The features extracted include texture, color, nuclei-based, and triangle features. Analysis and classification were performed using feature- and decision-level data fusion techniques such as support vector machines, statistical methods such as logistic regression and linear discriminant analysis, and voting algorithms. --Abstract, page iv
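
Decision-level fusion by voting, mentioned at the end of the abstract, can be sketched as follows. This is a minimal plurality vote under assumed inputs: the class labels and the three classifier outputs are hypothetical, and the dissertation's actual voting scheme may differ.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("at least one classifier output is required")
    return Counter(labels).most_common(1)[0][0]


def fuse_decisions(per_classifier: List[List[str]]) -> List[str]:
    """Fuse the outputs of several classifiers sample by sample.

    per_classifier[c][i] is the label classifier c assigned to sample i.
    """
    return [majority_vote(votes) for votes in zip(*per_classifier)]


# Hypothetical outputs of three classifiers on three regions of interest.
svm_out = ["CIN1", "CIN2", "normal"]
logreg_out = ["CIN1", "CIN3", "normal"]
lda_out = ["CIN2", "CIN2", "CIN1"]

fused = fuse_decisions([svm_out, logreg_out, lda_out])
```

Here each sample receives the label that at least two of the three classifiers agree on.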

    EEG sleep stages identification based on weighted undirected complex networks

    Sleep scoring is important in sleep research because errors in scoring a patient's sleep electroencephalography (EEG) recordings can cause serious problems such as incorrect diagnosis, medication errors, and misinterpretation of the recordings. The aim of this research is to develop a new automatic method for EEG sleep stage classification based on a statistical model and weighted brain networks. Methods: Each EEG segment is partitioned into a number of blocks using a sliding window technique. A set of statistical features is extracted from each block. As a result, a vector of features is obtained to represent each EEG segment. Then, the vector of features is mapped into a weighted undirected network. Different structural and spectral attributes of the networks are extracted and forwarded to a least square support vector machine (LS-SVM) classifier. At the same time, the networks' attributes are also thoroughly investigated. It is found that the networks' characteristics vary with the sleep stage. Each sleep stage is best represented using the key features of its network. Results: The proposed method is evaluated using two datasets acquired from different EEG channels (Pz-Oz and C3-A2) according to the R&K and AASM standards, without pre-processing the original EEG data. The results obtained by the LS-SVM are compared with those obtained by Naïve, k-nearest and a multi-class-SVM. The proposed method is also compared with other benchmark sleep stage classification methods. The comparison results demonstrate that the proposed method has an advantage in scoring sleep stages based on single-channel EEG signals. Conclusions: An average accuracy of 96.74% is obtained with the C3-A2 channel according to the AASM standard, and 96% with the Pz-Oz channel based on the R&K standard.
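
The pipeline described in the Methods part (sliding-window blocks, per-block statistics, a feature vector, then a weighted undirected network) might look roughly like the sketch below. The abstract does not state which statistics are used or how edge weights are defined, so the four statistics and the similarity weight `1 / (1 + |f_i - f_j|)` are assumptions made only for illustration.

```python
import statistics
from typing import List


def sliding_blocks(segment: List[float], size: int, step: int) -> List[List[float]]:
    """Split a segment into blocks of `size` samples, advancing by `step`."""
    return [segment[i:i + size] for i in range(0, len(segment) - size + 1, step)]


def block_features(block: List[float]) -> List[float]:
    """Assumed feature set: mean, population std, minimum, maximum."""
    return [statistics.mean(block), statistics.pstdev(block), min(block), max(block)]


def feature_vector(segment: List[float], size: int, step: int) -> List[float]:
    """Concatenate the features of every block into one vector."""
    vector: List[float] = []
    for block in sliding_blocks(segment, size, step):
        vector.extend(block_features(block))
    return vector


def weighted_network(vector: List[float]) -> List[List[float]]:
    """Build a symmetric weight matrix with one node per feature value.

    Assumed edge weight: 1 / (1 + |f_i - f_j|), so similar values are
    connected more strongly. No self-loops are created.
    """
    n = len(vector)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 / (1.0 + abs(vector[i] - vector[j]))
            weights[i][j] = weights[j][i] = w
    return weights


def node_strengths(weights: List[List[float]]) -> List[float]:
    """Sum of edge weights per node, a simple structural attribute."""
    return [sum(row) for row in weights]
```

Attributes such as the node strengths could then be passed to any classifier; the LS-SVM stage named in the abstract is not reproduced here.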