18 research outputs found

    The state of hop (Humulus lupulus L.) production in Slovenia

    Background: The growing volume of multi-omics information on carefully phenotyped patients in studies of complex diseases requires novel methods for data integration. Unlike the continuous intensity measurements in most omics data sets, phenome data contain clinical variables that are binary, ordinal and categorical. Results: In this paper we introduce an integrative phenotyping framework (iPF) for disease subtype discovery. A feature topology plot was developed for effective dimension reduction and visualization of multi-omics data. The approach is free of model assumptions and robust to noise and missingness in the data. We developed a workflow to integrate homogeneous patient clustering from different omics data in an agglomerative manner and then visualized heterogeneous clustering of pairwise omics sources. We applied the framework to two batches of lung samples obtained from patients diagnosed with chronic obstructive lung disease (COPD) or interstitial lung disease (ILD) with well-characterized clinical (phenomic) data, mRNA and microRNA expression profiles. Application of iPF to the first training batch identified clusters of patients with homogeneous disease phenotypes as well as clusters with intermediate disease characteristics. Analysis of the second batch revealed a similar data structure, confirming the presence of intermediate clusters. Genes in the intermediate clusters were enriched with inflammatory and immune functional annotations, suggesting that they represent mechanistically distinct disease subphenotypes that may respond to immunomodulatory therapies. The iPF software package and all source code are publicly available. Conclusions: Identification of subclusters with distinct clinical and biomolecular characteristics suggests that integration of phenomic and other omics information could lead to identification of novel mechanism-based disease subphenotypes.

    Additional file 1: of Integrative phenotyping framework (iPF): integrative clustering of multiple omics data identifies novel lung disease subphenotypes

    Text S1. Materials and data collection. Text S2. Details of smoothing and Feature Topology Plots (FTP). Text S3. Simulation setting to evaluate iPF. Text S4. Comprehensive validation scheme for iPF. Figure S5. (A) An illustration of integrated omics data sets, (B) A workflow to generate the feature topology plot (FTP). Figure S6. Flowchart of the validation scheme for the integrative phenotyping framework for multiple omics data sets. Figure S7. An example of iPF that utilizes fused multiple data sets at stage (vi). Figure S8. Examples of iPF using various combinations of the omics data sets (pooled analysis). Figure S9A. The gap statistic and its scree plot for choosing the optimal number of clusters (clinical and miRNA data). Figure S9B. The gap statistic and its scree plot for choosing the optimal number of clusters (mRNA and miRNA data). Figure S9C. The gap statistic and its scree plot for choosing the optimal number of clusters (mRNA and clinical data). Figure S9D. The gap statistic and its scree plot for choosing the optimal number of clusters (clinical data and combined mRNA and miRNA data). Figure S10. The best choice of the number of feature modules. Figure S11. Simulation study shows robust true feature discovery in “Feature Fusion”. The x-axis represents multiplication levels of noise features. The y-axis represents average ARIs from 100 simulations. Each figure is generated from simulation scenarios with a different number of true features (200, 400, and 600, respectively). Figure S12. Immunomodulating drugs target overexpressed genes in module two. Table S13. The description of mRNA and miRNA lung disease data. Table S14. Various correlation types depending on variable attributes. Table S15. The demographic summary of clinical features in each sub-cluster. Table S16. Target gene enrichment analysis (via Fisher exact test) related to twelve. Table S17. Regression analysis on target miRNA features, and coefficient of determination significant miRNA features. Table S18. The top disease or functional annotations associated with genes in module two in Cluster E patients. Figure S19. Basic consensus clustering using only gene expression data. (DOCX 6398 kb)
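
Figure S11 reports average ARIs (adjusted Rand indices) across simulations. For readers unfamiliar with the measure, the following is a minimal sketch of how an ARI between a true and a recovered partition can be computed. It is not taken from the iPF package; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same samples.

    Hypothetical helper for illustration; not part of the iPF software.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two samples: the partitions agree trivially.
        return 1.0

    # Contingency counts: how many samples share each (true, pred) label pair.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of sample pairs placed together in both, in true, and in pred.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # best achievable agreement
    if max_index == expected:
        # Degenerate case (e.g. a single cluster in both partitions).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (made up): the same grouping under renamed cluster labels.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (made up): a grouping that splits every true cluster.
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the two partitions are identical up to relabelling, and values near 0 mean agreement no better than chance. Averaging this quantity over repeated simulations gives a curve of the kind Figure S11 describes.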

    Integrative genetic and genomic networks identify microRNA associated with COPD and ILD

    Chronic obstructive pulmonary disease (COPD) and interstitial lung disease (ILD) are clinically and molecularly heterogeneous diseases. We utilized clustering and integrative network analyses to elucidate roles for microRNAs (miRNAs) and miRNA isoforms (isomiRs) in COPD and ILD pathogenesis. Short RNA sequencing was performed on 351 lung tissue samples of COPD (n = 145), ILD (n = 144) and controls (n = 64). Five distinct subclusters of samples were identified including 1 COPD-predominant cluster and 2 ILD-predominant clusters which associated with different clinical measurements of disease severity. Utilizing 262 samples with gene expression and SNP microarrays, we built disease-specific genetic and expression networks to predict key miRNA regulators of gene expression. Members of miR-449/34 family, known to promote airway differentiation by repressing the Notch pathway, were among the top connected miRNAs in both COPD and ILD networks. Genes associated with miR-449/34 members in the disease networks were enriched among genes that increase in expression with airway differentiation at an air–liquid interface. A highly expressed isomiR containing a novel seed sequence was identified at the miR-34c-5p locus. 47% of the anticorrelated predicted targets for this isomiR were distinct from the canonical seed sequence for miR-34c-5p. Overexpression of the canonical miR-34c-5p and the miR-34c-5p isomiR with an alternative seed sequence down-regulated NOTCH1 and NOTCH4. However, only overexpression of the isomiR down-regulated genes involved in Ras signaling such as CRKL and GRB2. Overall, these findings elucidate molecular heterogeneity inherent across COPD and ILD patients and further suggest roles for miR-34c in regulating disease-associated gene-expression.
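
The abstract reports that miRNA-associated genes were enriched among a reference gene set but does not say which test was used. One common choice for this kind of overlap question is a one-sided hypergeometric test (equivalent to a one-sided Fisher exact test). The sketch below is illustrative only: the function name `enrichment_p_value` and all counts are assumptions, not values from the study.

```python
from math import comb


def enrichment_p_value(overlap, set_size, annotated, universe):
    """P(X >= overlap) for X ~ Hypergeometric(universe, annotated, set_size).

    universe  : total number of genes considered
    annotated : genes in the reference set (e.g. a differentiation signature)
    set_size  : genes in the query set (e.g. genes linked to one miRNA)
    overlap   : genes present in both sets
    Hypothetical helper for illustration; not the authors' code.
    """
    if not (0 <= annotated <= universe and 0 <= set_size <= universe):
        raise ValueError("set sizes must lie between 0 and the universe size")

    upper = min(set_size, annotated)  # largest overlap that is possible
    if overlap > upper:
        return 0.0
    lower = max(overlap, 0)

    total = comb(universe, set_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, set_size - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


# Toy counts (made up): 10 genes, 5 annotated, 5 in the query set, all 5 shared.
p_full_overlap = enrichment_p_value(5, 5, 5, 10)
# Any overlap of at least 0 is certain, so the tail probability is 1.
p_no_overlap = enrichment_p_value(0, 5, 5, 10)
```

A small p-value indicates that the observed overlap is larger than expected if the query genes were drawn at random from the universe. In practice the result depends strongly on how the gene universe is defined.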