
    Classification performance for COVID patient prognosis from automatic AI segmentation—a single-center study

    Background: COVID-19 assessment can be performed using the recently developed individual risk score (prediction of severe respiratory failure in hospitalized patients with SARS-CoV-2 infection, PREDI-CO score) based on high-resolution computed tomography. In this study, we evaluated the possibility of automating this estimation with semi-supervised, AI-based radiomics, leveraging unsupervised segmentation of ground-glass areas. Methods: We collected 92 CT scans from patients treated at the IRCCS Sant’Orsola-Malpighi Policlinic and from public databases; each lung was segmented with a pre-trained AI method; ground-glass opacity was identified with a novel unsupervised approach; and radiomic measurements were extracted and used to predict clinically relevant scores, with particular focus on mortality and the PREDI-CO score. We compared the predictions obtained with different machine learning approaches. Results: All methods achieved a balanced accuracy of 70% on the PREDI-CO score but did not yield satisfactory results on other clinical characteristics because of class imbalance. Conclusions: Semi-supervised segmentation, implemented as a combination of unsupervised segmentation and feature extraction, appears to be a viable approach to patient stratification and could be leveraged to train more complex models. This would be useful in high-demand situations such as the current pandemic, where it could support gold-standard segmentation for AI training.
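
    As an illustration of the kind of pipeline described above, the sketch below chains an (assumed, pre-computed) lung mask, a simple Hounsfield-unit threshold standing in for the unsupervised ground-glass segmentation, a handful of first-order radiomic-style features, and a scikit-learn classifier scored with balanced accuracy. The thresholds, the feature set, the classifier, and the synthetic data are illustrative assumptions, not the method or data of the study.

    # Minimal sketch: lung mask -> unsupervised GGO mask -> radiomic-style
    # features -> classifier. Thresholds, features, and data are illustrative
    # assumptions, not the study's actual pipeline.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    def ggo_mask(ct_hu, lung_mask, low=-750, high=-350):
        """Rough ground-glass-opacity mask: lung voxels whose Hounsfield units
        fall in an assumed GGO range (no labels needed)."""
        return lung_mask & (ct_hu > low) & (ct_hu < high)

    def radiomic_features(ct_hu, lung_mask):
        """A few first-order features; a real pipeline would use a dedicated
        radiomics library with many more descriptors."""
        ggo = ggo_mask(ct_hu, lung_mask)
        ggo_fraction = ggo.sum() / max(lung_mask.sum(), 1)
        ggo_hu = ct_hu[ggo] if ggo.any() else np.array([0.0])
        return np.array([ggo_fraction, ggo_hu.mean(), ggo_hu.std(),
                         np.percentile(ggo_hu, 90)])

    # Synthetic stand-ins for 92 patients: fake HU volumes, whole-volume lung
    # masks, and a random dichotomized PREDI-CO label.
    rng = np.random.default_rng(0)
    X = np.array([radiomic_features(rng.normal(-600, 200, size=(16, 64, 64)),
                                    np.ones((16, 64, 64), dtype=bool))
                  for _ in range(92)])
    y = rng.integers(0, 2, size=92)

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
    print(f"balanced accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")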

    M3C: Monte Carlo reference-based consensus clustering.

    Genome-wide data are used to stratify patients into classes for precision medicine using clustering algorithms. A common problem in this area is selection of the number of clusters (K). The Monti consensus clustering algorithm is a widely used method that uses stability selection to estimate K. However, the method is biased towards higher values of K and yields high numbers of false positives. As a solution, we developed Monte Carlo reference-based consensus clustering (M3C), which builds on this algorithm. M3C simulates null distributions of stability scores for a range of K values, enabling a comparison with the real data that removes bias and statistically tests for the presence of structure. M3C corrects the inherent bias of consensus clustering, as demonstrated on simulated data and on real expression data from The Cancer Genome Atlas (TCGA). For testing M3C, we also developed clusterlab, a new method for simulating multivariate Gaussian clusters.
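
    The core idea can be sketched compactly: compute a consensus-clustering stability score (here the proportion of ambiguous clustering, PAC) on the real data for each K, and compare it against PAC scores from reference datasets simulated under a null model. In the sketch below, the null is a single multivariate Gaussian matched to the empirical mean and covariance, the subsampling settings and the 0.1/0.9 PAC window are conventional defaults, and the toy data are made up; the published M3C package adds considerably more machinery on top of this comparison.

    # Simplified sketch of Monte Carlo reference-based consensus clustering:
    # compare the PAC stability score of the real data with PAC scores from
    # Gaussian reference datasets, for each candidate K.
    import numpy as np
    from sklearn.cluster import KMeans

    def consensus_pac(X, k, n_iter=30, frac=0.8, rng=None):
        """PAC (proportion of ambiguous clustering) from a subsampled consensus matrix."""
        rng = rng or np.random.default_rng(0)
        n = X.shape[0]
        together = np.zeros((n, n))
        sampled = np.zeros((n, n))
        for _ in range(n_iter):
            idx = rng.choice(n, size=int(frac * n), replace=False)
            labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X[idx])
            together[np.ix_(idx, idx)] += labels[:, None] == labels[None, :]
            sampled[np.ix_(idx, idx)] += 1
        tri = (together / np.maximum(sampled, 1))[np.triu_indices(n, k=1)]
        return np.mean((tri > 0.1) & (tri < 0.9))   # fraction of ambiguous pairs

    def null_pac(X, k, n_refs=5, rng=None):
        """PAC on reference data drawn from a Gaussian matching X's mean/covariance."""
        rng = rng or np.random.default_rng(1)
        mean, cov = X.mean(axis=0), np.cov(X, rowvar=False)
        return [consensus_pac(rng.multivariate_normal(mean, cov, X.shape[0]), k, rng=rng)
                for _ in range(n_refs)]

    # Toy data: two well-separated Gaussian clusters in five dimensions.
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(4, 1, (40, 5))])
    for k in (2, 3, 4):
        print(f"K={k}: real PAC={consensus_pac(X, k):.3f}, "
              f"null PAC={np.mean(null_pac(X, k)):.3f}")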

    Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding

    Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of the initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter are now available. However, these data are often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements that are missing not at random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well suited to learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record.

    The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. Understanding the trends in practice, potential areas of disparities, and the value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important in understanding how best to implement a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities in access to upper endoscopy for patients with upper gastrointestinal bleeding by race/ethnicity across urban and rural hospitals. Projected accumulated savings from consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion five years after implementation.

    Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record-based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow-Blatchford Score and the Oakland score. While the best-performing gradient boosted decision tree model has overall performance equivalent to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients. Using another deep learning model that can model longitudinal risk, the long short-term memory recurrent neural network, the need for red blood cell transfusion can be predicted at 4-hour intervals over the first 24 hours of an intensive care unit stay for high-risk patients with acute gastrointestinal bleeding.

    Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and to characterize them by risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has performance, as measured by positive predictive value, equivalent to deep learning and natural language processing-based models, and after live implementation appears to have increased the use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding can be differentiated from patients with other groups of disease concepts by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover’s distances on the graph. For electronic health record data with values missing not at random, MURAL, an unsupervised random forest-based method, handles data with missing values and generates visualizations that characterize patients with gastrointestinal bleeding.

    This thesis forms a basis for understanding the potential of machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low-risk patients out of the hospital and to guide resuscitation and timely endoscopic procedures for patients at higher risk of clinical decompensation.
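
    As a minimal illustration of the comparison at the very-low-risk operating point described above (a threshold chosen so that sensitivity stays at roughly 99%), the sketch below counts how many patients each of two models can call very low risk. The scores, the 10% event rate, and the two model stand-ins are simulated assumptions, not the thesis's data or models.

    # Sketch: compare two risk models at a ~99%-sensitivity threshold by counting
    # how many patients each can label "very low risk". Data are simulated.
    import numpy as np

    def threshold_at_sensitivity(y_true, scores, target_sens=0.99):
        """Largest threshold that still classifies >= target_sens of positives as at risk."""
        pos_scores = np.sort(scores[y_true == 1])
        n_miss = int(np.floor((1 - target_sens) * len(pos_scores)))
        return pos_scores[n_miss]            # "at risk" means score >= threshold

    def low_risk_count(y_true, scores, target_sens=0.99):
        thr = threshold_at_sensitivity(y_true, scores, target_sens)
        sens = (scores[y_true == 1] >= thr).mean()
        return int((scores < thr).sum()), sens

    rng = np.random.default_rng(0)
    y = (rng.random(5000) < 0.10).astype(int)          # ~10% adverse outcomes
    model_a = 0.6 * y + rng.normal(0, 0.30, 5000)      # a less discriminative model
    model_b = 0.6 * y + rng.normal(0, 0.25, 5000)      # a more discriminative model
    for name, s in [("model A", model_a), ("model B", model_b)]:
        n_low, sens = low_risk_count(y, s)
        print(f"{name}: {n_low} very-low-risk patients at sensitivity {sens:.3f}")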

    Evolutionary multiobjective clustering algorithms with ensemble for patient stratification

    Patient stratification has been studied widely to tackle subtype diagnosis problems for effective treatment. Because of the curse of dimensionality and the poor interpretability of the data, constructing a stratification model with high diagnostic ability and good generalization remains a long-standing challenge. To address these problems, this paper proposes two novel evolutionary multiobjective clustering algorithms with ensemble (NSGA-II-ECFE and MOEA/D-ECFE) that use four cluster validity indices as the objective functions. First, an effective ensemble construction method is developed to enrich ensemble diversity. After that, an ensemble clustering fitness evaluation (ECFE) method is proposed to evaluate the ensembles by measuring the consensus clustering under those four objective functions. To generate the consensus clustering, ECFE exploits the hybrid co-association matrix derived from the ensembles and then dynamically selects a suitable clustering algorithm to apply to that matrix. Multiple experiments have been conducted to demonstrate the effectiveness of the proposed algorithms in comparison with seven clustering algorithms, twelve ensemble clustering approaches, and two multiobjective clustering algorithms on 55 synthetic datasets and 35 real patient stratification datasets. The experimental results demonstrate the competitive edge of the proposed algorithms over the compared methods. Furthermore, the proposed algorithms are applied to identify cancer subtypes from five cancer-related single-cell RNA-seq datasets.
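
    The central ingredient of the ensemble evaluation described above is a co-association (consensus) matrix built from a diverse set of base clusterings. The sketch below shows only that generic step, assuming k-means base learners with varying K, average-linkage consensus, and a toy dataset; it is not the NSGA-II-ECFE or MOEA/D-ECFE procedure itself.

    # Sketch of the co-association step used in ensemble clustering: record how
    # often each pair of samples co-clusters across an ensemble, then cut a
    # hierarchical clustering of that matrix to get a consensus partition.
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    def co_association(X, ks=(2, 3, 4, 5), runs_per_k=5, seed=0):
        """Fraction of ensemble members placing each pair in the same cluster."""
        co = np.zeros((len(X), len(X)))
        members = 0
        for k in ks:
            for r in range(runs_per_k):
                labels = KMeans(n_clusters=k, n_init=5,
                                random_state=seed + 100 * k + r).fit_predict(X)
                co += labels[:, None] == labels[None, :]
                members += 1
        return co / members

    def consensus_partition(co, n_clusters):
        """Average-linkage clustering on the distance 1 - co-association."""
        dist = 1.0 - co
        np.fill_diagonal(dist, 0.0)
        Z = linkage(squareform(dist, checks=False), method="average")
        return fcluster(Z, t=n_clusters, criterion="maxclust")

    X, _ = make_blobs(n_samples=150, centers=3, cluster_std=1.0, random_state=0)
    labels = consensus_partition(co_association(X), n_clusters=3)
    print("consensus cluster sizes:", np.bincount(labels)[1:])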

    Computational methods for physiological data

    Thesis (Ph. D.)--Harvard-MIT Division of Health Sciences and Technology, 2009. Author is also affiliated with the MIT Dept. of Electrical Engineering and Computer Science. Cataloged from PDF version of thesis. Includes bibliographical references (p. 177-188).

    Large volumes of continuous waveform data are now collected in hospitals. These datasets provide an opportunity to advance medical care, by capturing rare or subtle phenomena associated with specific medical conditions, and by providing fresh insights into disease dynamics over long time scales. We describe how progress in medicine can be accelerated through the use of sophisticated computational methods for the structured analysis of large multi-patient, multi-signal datasets. We propose two new approaches, morphologic variability (MV) and physiological symbolic analysis, for the analysis of continuous long-term signals. MV studies subtle micro-level variations in the shape of physiological signals over long periods. These variations, which are often widely considered to be noise, can contain important information about the state of the underlying system. Symbolic analysis studies the macro-level information in signals by abstracting them into symbolic sequences. Converting continuous waveforms into symbolic sequences facilitates the development of efficient algorithms to discover high-risk patterns and patients who are outliers in a population.

    We apply our methods to the clinical challenge of identifying patients at high risk of cardiovascular mortality (almost 30% of all deaths worldwide each year). When evaluated on ECG data from over 4,500 patients, high MV was strongly associated with both cardiovascular death and sudden cardiac death. MV was a better predictor of these events than other ECG-based metrics. Furthermore, these results were independent of information in echocardiography, clinical characteristics, and biomarkers. Our symbolic analysis techniques also identified groups of patients exhibiting a varying risk of adverse outcomes. One group, with a particular set of symbolic characteristics, showed a 23-fold increased risk of death in the months following a mild heart attack, while another exhibited a 5-fold increased risk of future heart attacks.

    by Zeeshan Hassan Syed, Ph.D.
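
    As a generic illustration of abstracting a continuous signal into a symbolic sequence, the sketch below averages fixed-length windows, maps the averages to a small alphabet by quantile binning, and counts short symbol patterns. The window length, alphabet size, pattern length, and toy signal are assumptions for illustration; they are not the specific symbolic analysis or morphologic variability measures developed in the thesis.

    # Sketch of symbolic abstraction of a continuous signal: window means ->
    # quantile-binned symbols -> counts of short symbolic patterns.
    import numpy as np
    from collections import Counter

    def symbolize(signal, window=50, alphabet_size=4):
        """Reduce a 1-D signal to a symbol sequence via window means and quantile bins."""
        n_windows = len(signal) // window
        means = signal[:n_windows * window].reshape(n_windows, window).mean(axis=1)
        edges = np.quantile(means, np.linspace(0, 1, alphabet_size + 1)[1:-1])
        return np.digitize(means, edges)     # symbols 0 .. alphabet_size - 1

    def pattern_counts(symbols, length=3):
        """Frequency of each run of `length` consecutive symbols."""
        return Counter(tuple(symbols[i:i + length])
                       for i in range(len(symbols) - length + 1))

    # Toy "physiological" signal: a slow oscillation plus noise.
    rng = np.random.default_rng(0)
    t = np.arange(20_000)
    signal = np.sin(2 * np.pi * t / 1500) + 0.3 * rng.normal(size=t.size)

    symbols = symbolize(signal)
    print("most common 3-symbol patterns:", pattern_counts(symbols).most_common(3))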