186 research outputs found

    SUBIC: A Supervised Bi-Clustering Approach for Precision Medicine

    Full text link
    Traditional medicine typically applies one-size-fits-all treatment for the entire patient population whereas precision medicine develops tailored treatment schemes for different patient subgroups. The fact that some factors may be more significant for a specific patient subgroup motivates clinicians and medical researchers to develop new approaches to subgroup detection and analysis, which is an effective strategy to personalize treatment. In this study, we propose a novel patient subgroup detection method, called Supervised Biclustring (SUBIC) using convex optimization and apply our approach to detect patient subgroups and prioritize risk factors for hypertension (HTN) in a vulnerable demographic subgroup (African-American). Our approach not only finds patient subgroups with guidance of a clinically relevant target variable but also identifies and prioritizes risk factors by pursuing sparsity of the input variables and encouraging similarity among the input variables and between the input and target variable

    GEMS: a web server for biclustering analysis of expression data

    Get PDF
    The advent of microarray technology has revolutionized the search for genes that are differentially expressed across a range of cell types or experimental conditions. Traditional clustering methods, such as hierarchical clustering, are often difficult to deploy effectively since genes rarely exhibit similar expression pattern across a wide range of conditions. Biclustering of gene expression data (also called co-clustering or two-way clustering) is a non-trivial but promising methodology for the identification of gene groups that show a coherent expression profile across a subset of conditions. Thus, biclustering is a natural methodology as a screen for genes that are functionally related, participate in the same pathways, affected by the same drug or pathological condition, or genes that form modules that are potentially co-regulated by a small group of transcription factors. We have developed a web-enabled service called GEMS (Gene Expression Mining Server) for biclustering microarray data. Users may upload expression data and specify a set of criteria. GEMS then performs bicluster mining based on a Gibbs sampling paradigm. The web server provides a flexible and an useful platform for the discovery of co-expressed and potentially co-regulated gene modules. GEMS is an open source software and is available at

    GEMS: a web server for biclustering analysis of expression data

    Get PDF
    The advent of microarray technology has revolutionized the search for genes that are differentially expressed across a range of cell types or experimental conditions. Traditional clustering methods, such as hierarchical clustering, are often difficult to deploy effectively since genes rarely exhibit similar expression pattern across a wide range of conditions. Biclustering of gene expression data (also called co-clustering or two-way clustering) is a non-trivial but promising methodology for the identification of gene groups that show a coherent expression profile across a subset of conditions. Thus, biclustering is a natural methodology as a screen for genes that are functionally related, participate in the same pathways, affected by the same drug or pathological condition, or genes that form modules that are potentially co-regulated by a small group of transcription factors. We have developed a web-enabled service called GEMS (Gene Expression Mining Server) for biclustering microarray data. Users may upload expression data and specify a set of criteria. GEMS then performs bicluster mining based on a Gibbs sampling paradigm. The web server provides a flexible and an useful platform for the discovery of co-expressed and potentially co-regulated gene modules. GEMS is an open source software and is available at

    Bayesian Biclustering of Gene Expression Data

    Get PDF
    Background: Biclustering of gene expression data searches for local patterns of gene expression. A bicluster (or a two-way cluster) is defined as a set of genes whose expression profiles are mutually similar within a subset of experimental conditions/samples. Although several biclustering algorithms have been studied, few are based on rigorous statistical models. Results: We developed a Bayesian biclustering model (BBC), and implemented a Gibbs sampling procedure for its statistical inference. We showed that Bayesian biclustering model can correctly identify multiple clusters of gene expression data. Using simulated data both from the model and with realistic characters, we demonstrated the BBC algorithm outperforms other methods in both robustness and accuracy. We also showed that the model is stable for two normalization methods, the interquartile range normalization and the smallest quartile range normalization. Applying the BBC algorithm to the yeast expression data, we observed that majority of the biclusters we found are supported by significant biological evidences, such as enrichments of gene functions and transcription factor binding sites in the corresponding promoter sequences. Conclusions: The BBC algorithm is shown to be a robust model-based biclustering method that can discover biologically significant gene-condition clusters in microarray data. The BBC model can easily handle missing data via Monte Carlo imputation and has the potential to be extended to integrated study of gene transcription networks.Statistic

    Conjoined Dirichlet Process

    Get PDF
    Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer from the pre-specification of the number of biclusters or place constraints on the model structure. To address these issues, we develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns. The proposed method utilizes dual Dirichlet process mixture models to learn row and column clusters, with the number of resulting clusters determined by the data rather than pre-specified. Probabilistic biclusters are identified by modeling the mutual dependence between the row and column clusters. We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches

    A bi-ordering approach to linking gene expression with clinical annotations in gastric cancer

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>In the study of cancer genomics, gene expression microarrays, which measure thousands of genes in a single assay, provide abundant information for the investigation of interesting genes or biological pathways. However, in order to analyze the large number of noisy measurements in microarrays, effective and efficient bioinformatics techniques are needed to identify the associations between genes and relevant phenotypes. Moreover, systematic tests are needed to validate the statistical and biological significance of those discoveries.</p> <p>Results</p> <p>In this paper, we develop a robust and efficient method for exploratory analysis of microarray data, which produces a number of different orderings (rankings) of both genes and samples (reflecting correlation among those genes and samples). The core algorithm is closely related to biclustering, and so we first compare its performance with several existing biclustering algorithms on two real datasets - gastric cancer and lymphoma datasets. We then show on the gastric cancer data that the sample orderings generated by our method are highly statistically significant with respect to the histological classification of samples by using the Jonckheere trend test, while the gene modules are biologically significant with respect to biological processes (from the Gene Ontology). In particular, some of the gene modules associated with biclusters are closely linked to gastric cancer tumorigenesis reported in previous literature, while others are potentially novel discoveries.</p> <p>Conclusion</p> <p>In conclusion, we have developed an effective and efficient method, Bi-Ordering Analysis, to detect informative patterns in gene expression microarrays by ranking genes and samples. In addition, a number of evaluation metrics were applied to assess both the statistical and biological significance of the resulting bi-orderings. The methodology was validated on gastric cancer and lymphoma datasets.</p
    corecore