16 research outputs found

    netANOVA: novel graph clustering technique with significance assessment via hierarchical ANOVA.

    Peer reviewed. Many problems in the life sciences can be reduced to a comparison of graphs. Although many techniques for comparing graphs exist, they often assume prior knowledge about the partitioning or the number of clusters and fail to provide the statistical significance of the observed between-network heterogeneity. Addressing these issues, we developed an unsupervised workflow to identify groups of graphs from reliable network-based statistics. In particular, we first compute the similarity between networks via appropriate distance measures between graphs and use them in an unsupervised hierarchical algorithm to identify classes of similar networks. Then, to determine the optimal number of clusters, we recursively test for distances between two groups of networks. The test itself finds its inspiration in distance-wise ANOVA algorithms. Finally, we assess significance via the permutation of between-object distance matrices. Notably, the approach, which we will call netANOVA, is flexible since users can choose among multiple options to adapt it to specific contexts and network types. We demonstrate the benefits and pitfalls of our approach via extensive simulations and an application to two real-life datasets. NetANOVA achieved high performance in many simulation scenarios while controlling type I error. On non-synthetic data, comparison against state-of-the-art methods showed that netANOVA is often among the top performers. There are many application fields, including precision medicine, for which identifying disease subtypes via individual-level biological networks improves prevention programs, diagnosis and disease monitoring.
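    The two-group test described in this abstract can be illustrated with a minimal sketch. It assumes a PERMANOVA-style pseudo-F statistic computed from a precomputed network-to-network distance matrix, with significance obtained by shuffling group labels (equivalent to permuting the rows and columns of the distance matrix). The statistic, the function names `pseudo_f` and `permutation_p_value`, and the toy data are illustrative assumptions, not the published netANOVA implementation.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Distance-based pseudo-F (PERMANOVA-style; an assumed stand-in statistic)."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """Share of label permutations whose pseudo-F reaches the observed value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: each "network" is summarised by one number (hypothetical),
# and the distance between two networks is the absolute difference.
points = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
dist = [[abs(p - q) for q in points] for p in points]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

p_value = permutation_p_value(dist, labels)
print("pseudo-F:", pseudo_f(dist, labels), "p-value:", p_value)
```

    In a recursive scheme such as the one the abstract describes, a test of this kind would be applied at each split of the dendrogram, and splitting would stop once the two candidate groups are no longer significantly different.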

    netMUG: a novel network-guided multi-view clustering workflow for dissecting genetic and facial heterogeneity.

    Multi-view data offer advantages over single-view data for characterizing individuals, which is crucial in precision medicine toward personalized prevention, diagnosis, or treatment follow-up. Here, we develop a network-guided multi-view clustering framework named netMUG to identify actionable subgroups of individuals. This pipeline first adopts sparse multiple canonical correlation analysis to select multi-view features possibly informed by extraneous data, which are then used to construct individual-specific networks (ISNs). Finally, the individual subtypes are automatically derived by hierarchical clustering on these network representations. We applied netMUG to a dataset containing genomic data and facial images to obtain BMI-informed multi-view strata and showed how it could be used for a refined obesity characterization. Benchmark analysis of netMUG on synthetic data with known strata of individuals indicated its superior performance compared with both baseline and benchmark methods for multi-view clustering. In addition, the real-data analysis revealed subgroups strongly linked to BMI and genetic and facial determinants of these classes. NetMUG provides a powerful strategy, exploiting individual-specific networks to identify meaningful and actionable strata. Moreover, the implementation is easy to generalize to accommodate heterogeneous data sources or highlight data structures.
    Author summary: In recent years, it has become increasingly feasible to collect data from multiple modalities in various fields, which calls for novel methods that exploit the consensus among different data types. As exemplified in systems biology or epistasis analyses, the interactions between features may contain more information than the features themselves, thereby necessitating the use of feature networks. Furthermore, in real-life scenarios, subjects, such as patients or individuals, may originate from diverse populations, which underscores the importance of subtyping or clustering these subjects to account for their heterogeneity. In this study, we present a novel pipeline for selecting the most relevant features from multiple data types, constructing a feature network for each subject, and obtaining a subgrouping of samples informed by a phenotype of interest. We validated our method on synthetic data and demonstrated its superiority over several state-of-the-art multi-view clustering approaches. Additionally, we applied our method to a real-life, large-scale dataset of genomic data and facial images, where it effectively identified a meaningful BMI subtyping that complemented existing BMI categories and offered new biological insights. Our proposed method has wide applicability to complex multi-view or multi-omics datasets for tasks such as disease subtyping or personalized medicine.
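    The step of building one feature network per subject can be sketched as follows. The abstract does not say how netMUG constructs its ISNs, so this example assumes a common leave-one-out scheme: an individual's edge weight is n times the edge weight from all n subjects minus (n - 1) times the edge weight with that subject removed, with Pearson correlation as the edge measure. The function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def edge_weights(rows):
    """Correlation for every feature pair; rows are subjects, columns features."""
    n_features = len(rows[0])
    columns = [[row[k] for row in rows] for k in range(n_features)]
    return {
        (i, j): pearson(columns[i], columns[j])
        for i, j in combinations(range(n_features), 2)
    }


def individual_specific_networks(rows):
    """Leave-one-out ISNs (assumed construction, see lead-in)."""
    n = len(rows)
    full = edge_weights(rows)
    networks = []
    for q in range(n):
        without_q = edge_weights(rows[:q] + rows[q + 1:])
        networks.append(
            {edge: n * full[edge] - (n - 1) * without_q[edge] for edge in full}
        )
    return networks


def network_distance(net_a, net_b):
    """Euclidean distance between two networks over their shared edge set."""
    return sqrt(sum((net_a[e] - net_b[e]) ** 2 for e in net_a))


# Toy data: 6 subjects, 3 selected features (values are made up).
rows = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 1.5],
    [3.0, 6.2, 0.7],
    [4.0, 8.1, 2.0],
    [5.0, 9.8, 1.1],
    [6.0, 12.3, 2.4],
]
isns = individual_specific_networks(rows)
distances = [[network_distance(a, b) for b in isns] for a in isns]
```

    The resulting subject-by-subject distance matrix is the kind of input a hierarchical clustering routine would take to derive subgroups.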

    Targeted methylation profiling of single laser-capture microdissected post-mortem brain cells by adapted limiting dilution bisulfite pyrosequencing (LDBSP)

    A recurring issue in neuroepigenomic studies, especially in the context of neurodegenerative disease, is the use of (heterogeneous) bulk tissue, which generates noise during epigenetic profiling. A workable solution to this issue is to quantify epigenetic patterns in individually isolated neuronal cells using laser capture microdissection (LCM). For this purpose, we established a novel approach for targeted DNA methylation profiling of individual genes that relies on a combination of LCM and limiting dilution bisulfite pyrosequencing (LDBSP). Using this approach, we determined cytosine-phosphate-guanine (CpG) methylation rates of single alleles derived from 50 neurons that were isolated from unfixed post-mortem brain tissue. In the present manuscript, we describe the general workflow and, as a showcase, demonstrate how targeted methylation analysis of various genes, in this case, RHBDF2, OXT, TNXB, DNAJB13, PGLYRP1, C3, and LMX1B, can be performed simultaneously. By doing so, we describe an adapted data analysis pipeline for LDBSP, allowing one to include and correct CpG methylation rates derived from multi-allele reactions. In addition, we show that the efficiency of LDBSP on DNA derived from LCM neurons is similar to the efficiency obtained in previously published studies using this technique on other cell types. Overall, the method described here provides the user with a more accurate estimation of the DNA methylation status of each target gene in the analyzed cell pools, thereby adding further validity to this approach.

    netMUG: a novel network-guided multi-view clustering workflow for dissecting genetic and facial heterogeneity

    Introduction: Multi-view data offer advantages over single-view data for characterizing individuals, which is crucial in precision medicine toward personalized prevention, diagnosis, or treatment follow-up. Methods: Here, we develop a network-guided multi-view clustering framework named netMUG to identify actionable subgroups of individuals. This pipeline first adopts sparse multiple canonical correlation analysis to select multi-view features possibly informed by extraneous data, which are then used to construct individual-specific networks (ISNs). Finally, the individual subtypes are automatically derived by hierarchical clustering on these network representations. Results: We applied netMUG to a dataset containing genomic data and facial images to obtain BMI-informed multi-view strata and showed how it could be used for a refined obesity characterization. Benchmark analysis of netMUG on synthetic data with known strata of individuals indicated its superior performance compared with both baseline and benchmark methods for multi-view clustering. The clustering derived from netMUG achieved an adjusted Rand index of 1 with respect to the synthesized true labels. In addition, the real-data analysis revealed subgroups strongly linked to BMI and genetic and facial determinants of these subgroups. Discussion: netMUG provides a powerful strategy, exploiting individual-specific networks to identify meaningful and actionable strata. Moreover, the implementation is easy to generalize to accommodate heterogeneous data sources or highlight data structures.
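    For readers unfamiliar with the adjusted Rand index reported above, the following minimal sketch shows how it is computed from two label vectors. A value of 1 means the two partitions agree exactly, up to a renaming of the cluster labels. The function name and the toy label vectors are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two partitions of the same items."""
    n = len(true_labels)
    # Contingency table stored sparsely: (true, predicted) -> count.
    cells = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    # Expected pair agreement under random labelling with fixed cluster sizes.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Same partition with swapped label names: perfect agreement.
perfect = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Every predicted cluster mixes both true clusters: worse than chance.
mixed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

    Because the index is corrected for chance, random labelings score near 0 and systematically discordant ones can score below 0, which is why a value of exactly 1 on synthetic data indicates full recovery of the simulated strata.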

    Heterogeneity between networks in systems biomedicine: detection and remediation
