A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies
Large-scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses.
Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules and GWAS results for providing novel and complementary approaches to investigate the molecular pathology of MDD and other complex brain disorders. © 2014 Chang et al
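The enrichment step described above (asking whether a coexpression module contains more GWAS-identified genes than chance would predict) can be illustrated with a one-sided hypergeometric over-representation test. This is only a minimal sketch of a commonly used test, not necessarily the statistic used in the study. The function name and all counts except the 88-gene module size are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, gwas_in_universe, module_size, overlap):
    """One-sided P(X >= overlap) for a module drawn without replacement.

    universe_size:    number of genes that could have entered any module
    gwas_in_universe: how many of those genes are GWAS hits
    module_size:      number of genes in the module under test
    overlap:          observed number of GWAS hits inside the module
    """
    max_overlap = min(module_size, gwas_in_universe)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(gwas_in_universe, k)
        * comb(universe_size - gwas_in_universe, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: only the module size (88) comes from the abstract.
p_value = hypergeom_enrichment_p(
    universe_size=10000, gwas_in_universe=300, module_size=88, overlap=10
)
print(f"enrichment p-value: {p_value:.3g}")
```

When many modules are screened (50 in the abstract), the resulting p-values would normally need a multiple-testing correction; which correction the authors applied is not stated here.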
A comparison of four clustering methods for brain expression microarray data
Background
DNA microarrays, which determine the expression levels of tens of thousands of genes from a sample, are an important research tool. However, the volume of data they produce can be an obstacle to interpretation of the results. Clustering the genes on the basis of similarity of their expression profiles can simplify the data, and potentially provides an important source of biological inference, but these methods have not been tested systematically on datasets from complex human tissues. In this paper, four clustering methods, CRC, k-means, ISA and memISA, are used upon three brain expression datasets. The results are compared on speed, gene coverage and GO enrichment. The effects of combining the clusters produced by each method are also assessed.
Results
k-means outperforms the other methods, with 100% gene coverage and GO enrichments only slightly exceeded by memISA and ISA. Those two methods produce greater GO enrichments on the datasets used, but at the cost of much lower gene coverage, fewer clusters, and slower run times. The clusters they find are largely different from those produced by k-means. Combining clusters produced by k-means with those from memISA or ISA increases GO enrichment and the number of clusters (compared to k-means alone) without reducing gene coverage. memISA can also find potentially disease-related clusters. In two independent dorsolateral prefrontal cortex datasets, it finds three overlapping clusters that are enriched for genes associated with schizophrenia, genes differentially expressed in schizophrenia, or both. Two of these clusters are enriched for genes of the MAP kinase pathway, suggesting a possible role for this pathway in the aetiology of schizophrenia.
Conclusion
Considered alone, k-means clustering is the most effective of the four methods on typical microarray brain expression datasets. However, memISA and ISA can add extra high-quality clusters to the set produced by k-means, so combining these three methods is the approach of choice.
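The gene-coverage comparison can be made concrete with a small sketch. It assumes that coverage means the fraction of genes assigned to at least one cluster, which is why a partitioning method such as k-means always reaches 100% while a method that returns only selected modules may not. The toy expression profiles, gene names, and the "partial" cluster set standing in for an ISA-like result are all hypothetical. The k-means routine is a plain Lloyd's iteration, not the implementation used in the paper.

```python
import random


def kmeans(profiles, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric profiles."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def gene_coverage(clusters, all_genes):
    """Fraction of genes that appear in at least one cluster."""
    covered = set().union(*clusters)
    return len(covered & set(all_genes)) / len(all_genes)


# Hypothetical toy data: six genes measured in three samples.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
profiles = [
    [1.0, 2.0, 3.0], [1.1, 2.1, 3.1], [0.9, 1.9, 2.9],
    [3.0, 2.0, 1.0], [3.1, 2.1, 1.1], [2.9, 1.9, 0.9],
]

labels, _ = kmeans(profiles, k=2)
kmeans_clusters = [
    {g for g, lab in zip(genes, labels) if lab == j} for j in range(2)
]

# Stand-in for a method that reports only a few (possibly overlapping) modules.
partial_clusters = [{"g1", "g2"}, {"g2", "g3"}]

# Combining methods here simply pools their clusters into one list.
combined_clusters = kmeans_clusters + partial_clusters

print(gene_coverage(kmeans_clusters, genes))
print(gene_coverage(partial_clusters, genes))
print(gene_coverage(combined_clusters, genes))
```

Pooling clusters in this way cannot lower coverage, which is consistent with the reported result that adding memISA or ISA clusters to the k-means set does not reduce it.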
Jacobian-Scaled K-means Clustering for Physics-Informed Segmentation of Reacting Flows
This work introduces Jacobian-scaled K-means (JSK-means) clustering, which is a physics-informed clustering strategy centered on the K-means framework. The method allows for the injection of underlying physical knowledge into the clustering procedure through a distance function modification: instead of leveraging conventional Euclidean distance vectors, the JSK-means procedure operates on distance vectors scaled by matrices obtained from dynamical system Jacobians evaluated at the cluster centroids. The goal of this work is to show how the JSK-means algorithm, without modifying the input dataset, produces clusters that capture regions of dynamical similarity, in that the clusters are redistributed towards high-sensitivity regions in phase space and are described by similarity in the source terms of samples instead of the samples themselves. The algorithm is demonstrated on a complex reacting flow simulation dataset (a channel detonation configuration), where the dynamics in the thermochemical composition space are known through the highly nonlinear and stiff Arrhenius-based chemical source terms. Interpretations of cluster partitions in both physical space and composition space reveal how JSK-means shifts clusters produced by standard K-means towards regions of high chemical sensitivity (e.g., towards regions of peak heat release rate near the detonation reaction zone). The findings presented here illustrate the benefits of utilizing Jacobian-scaled distances in clustering techniques, and the JSK-means method in particular displays promising potential for improving former partition-based modeling strategies in reacting flow (and other multi-physics) applications.
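The distance modification described above can be sketched on a toy problem. The sketch makes several assumptions that the abstract does not confirm: the scaling matrix is taken to be the Jacobian itself, the centroid update is the ordinary cluster mean, and the two-variable system with an exponential rate term is a hypothetical stand-in for stiff chemical source terms. To first order, J(c)(x - c) approximates f(x) - f(c), which is one way to read the statement that clusters reflect similarity in source terms.

```python
import math
import random


def jacobian(x):
    """Jacobian of the hypothetical system f(x) = (-x0, -exp(x0) * x1)."""
    x0, x1 = x
    e = math.exp(x0)
    return [[-1.0, 0.0], [-e * x1, -e]]


def scaled_sq_dist(x, c):
    """Squared norm of J(c) (x - c): the distance vector scaled by the
    Jacobian evaluated at the centroid c (assumed form of the scaling)."""
    J = jacobian(c)
    d = [x[0] - c[0], x[1] - c[1]]
    v0 = J[0][0] * d[0] + J[0][1] * d[1]
    v1 = J[1][0] * d[0] + J[1][1] * d[1]
    return v0 * v0 + v1 * v1


def jsk_means(points, k, iters=100, seed=0):
    """K-means loop in which only the assignment distance is modified."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: scaled_sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Assumption: centroids are updated as plain means.
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical samples in a two-dimensional phase space.
rng = random.Random(1)
points = [[rng.uniform(0.0, 2.0), rng.uniform(0.0, 1.0)] for _ in range(200)]
jsk_labels, jsk_centroids = jsk_means(points, k=4)
print(jsk_centroids)
```

Because the Jacobian entries grow with exp(x0) in this toy system, a given displacement counts as a larger distance near centroids with large x0, which is the mechanism the abstract credits for moving clusters towards high-sensitivity regions.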