100 research outputs found

    Significance of bioinformatics in research of chronic obstructive pulmonary disease

    Chronic obstructive pulmonary disease (COPD) is an inflammatory disease characterized by progressive deterioration of pulmonary function and increasing airway obstruction, with high mortality worldwide. The advent of high-throughput omics techniques provided an opportunity to gain insight into the disease pathogenesis and the processes that contribute to its heterogeneity, and to find target-specific and disease-specific therapies. As an interdisciplinary field, bioinformatics supplied vital information for an integrative understanding of COPD. This review focused on the application of bioinformatics in COPD research, including biomarker discovery and systems biology. We also presented the requirements and challenges of applying bioinformatics to COPD research and interpreted these results from the perspective of clinical physicians.

    Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

    Background: Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity. Methods: We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster. Results: We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: 1) post-bronchodilator FEV1 percent predicted, 2) percent bronchodilator responsiveness, and quantitative CT measurements of 3) apical emphysema and 4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: 1) emphysema predominant; 2) bronchodilator responsive, with higher FEV1; 3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and 4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema predominant) was associated with TGFB1 SNP rs1800470. Conclusions: Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.
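
The clustering step described in this abstract (four phenotypic variables, k-means, four clusters) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' analysis: the toy values, the units, the z-score standardization, the farthest-first initialization, and the helper names `standardize`, `farthest_first`, and `kmeans` are all assumptions, and the factor-analysis step that selected the variables is not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def standardize(rows):
    """Z-score each column so no single variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def farthest_first(rows, k):
    """Deterministic initialization (an assumption made for reproducibility)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        nxt = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(rows, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first(rows, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each subject goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(row, centroids[j]))
                      for row in rows]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Synthetic toy subjects (made up for illustration, NOT study data). Columns:
# FEV1 % predicted, bronchodilator response %, apical emphysema, wall thickness
# (the last two in arbitrary units).
toy_rows = [
    [25.0, 3.0, 60.0, 1.00], [24.0, 3.5, 61.0, 1.02], [26.0, 2.5, 59.0, 0.98],
    [40.0, 20.0, 30.0, 1.10], [41.0, 19.5, 31.0, 1.12], [39.0, 20.5, 29.0, 1.08],
    [22.0, 4.0, 20.0, 0.80], [21.0, 4.5, 21.0, 0.82], [23.0, 3.5, 19.0, 0.78],
    [30.0, 5.0, 25.0, 1.60], [31.0, 5.5, 26.0, 1.62], [29.0, 4.5, 24.0, 1.58],
]

labels, centroids = kmeans(standardize(toy_rows), k=4)
print(labels)
```

The toy groups here are deliberately well separated, so the four clusters are recovered cleanly. The abstract reports only modest separation in the real cohort, where results would be more sensitive to initialization and scaling choices.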

    Evaluation of data processing pipelines on real-world electronic health records data for the purpose of measuring patient similarity

    BACKGROUND: The ever-growing size, breadth, and availability of patient data allow for a wide variety of clinical features to serve as inputs for phenotype discovery using cluster analysis. Data of mixed types in particular are not straightforward to combine into a single feature vector, and techniques used to address this can be biased towards certain data types in ways that are not immediately obvious or intended. In this context, the process of constructing clinically meaningful patient representations from complex datasets has not been systematically evaluated. AIMS: Our aim was to a) outline and b) implement an analytical framework to evaluate distinct methods of constructing patient representations from routine electronic health record data for the purpose of measuring patient similarity. We applied the analysis to a patient cohort diagnosed with chronic obstructive pulmonary disease. METHODS: Using data from the CALIBER data resource, we extracted clinically relevant features for a cohort of patients diagnosed with chronic obstructive pulmonary disease. We used four different data processing pipelines to construct lower dimensional patient representations from which we calculated patient similarity scores. We described the resulting representations, ranked the influence of each individual feature on patient similarity and evaluated the effect of different pipelines on clustering outcomes. Experts evaluated the resulting representations by rating the clinical relevance of similar patient suggestions with regard to a reference patient. RESULTS: Each of the four pipelines resulted in similarity scores primarily driven by a unique set of features. It was demonstrated that data transformations according to each pipeline prior to clustering can result in a variation of clustering results of over 40%. The most appropriate pipeline was selected on the basis of feature ranking and clinical expertise. There was moderate agreement between clinicians as measured by Cohen's kappa coefficient. CONCLUSIONS: Data transformation has downstream and unforeseen consequences in cluster analysis. Rather than viewing this process as a black box, we have shown ways to quantitatively and qualitatively evaluate and select the appropriate preprocessing pipeline.
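
The inter-rater agreement statistic mentioned in this abstract, Cohen's kappa, compares observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal illustration only. The function name `cohen_kappa` and the two rating lists are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty item list")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Made-up ratings of ten "similar patient" suggestions by two clinicians.
rater_a = ["relevant", "relevant", "relevant", "not", "not",
           "relevant", "not", "relevant", "not", "relevant"]
rater_b = ["relevant", "relevant", "not", "not", "not",
           "relevant", "relevant", "relevant", "not", "relevant"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 3))  # observed 0.8, chance 0.52, so kappa is about 0.583
```

Because kappa discounts chance agreement, two raters who agree on 80% of items can still score well below 0.8, as in this toy case.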

    COPD phenotypes and machine learning cluster analysis : A systematic review and future research agenda

    Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Peer reviewed. Postprint.

    Identification of Clinical Phenotypes Using Cluster Analyses in COPD Patients with Multiple Comorbidities

    Why We Should Target Small Airways Disease in Our Management of Chronic Obstructive Pulmonary Disease

    Acknowledgments: Editorial support was provided by Cindy Macpherson, PhD, of MediTech Media, UK, and was funded by Boehringer Ingelheim. Peer reviewed. Publisher PDF.

    Imaging-based clusters in current smokers of the COPD cohort associate with clinical characteristics: The SubPopulations and Intermediate Outcome Measures in COPD Study (SPIROMICS)

    Background: Classification of COPD is usually based on the severity of airflow limitation, which may not sensitively differentiate subpopulations. Using a multiscale imaging-based cluster analysis (MICA), we aim to identify subpopulations among current smokers with COPD. Methods: Among the SPIROMICS subjects, we analyzed computed tomography images at total lung capacity (TLC) and residual volume (RV) of 284 current smokers. Functional variables, e.g. functional small airways disease (fSAD%), were derived from registration of TLC and RV images. Structural variables, e.g. emphysema and airway wall thickness and diameter, were assessed on TLC images. We employed an unsupervised method for clustering. Results: Four clusters were identified. Cluster 1 had relatively normal airway structures; Cluster 2 had an increase in fSAD% and wall thickness; Cluster 3 exhibited a further increase in fSAD% but a decrease in wall thickness and airway diameter; Cluster 4 had a significant increase in fSAD% and emphysema. Clinically, Cluster 1 showed normal FEV1/FVC and low exacerbations. Cluster 4 showed relatively low FEV1/FVC and high exacerbations. While Cluster 2 and Cluster 3 showed similar exacerbations, Cluster 2 had the highest BMI among all clusters. Conclusions: Association of imaging-based clusters with existing clinical metrics suggests the sensitivity of MICA in differentiating subpopulations.