72 research outputs found

    Simple SVM based whole-genome segmentation

    We present a support vector machine (SVM) based framework for DNA segmentation into binary classes. Two applications are explored: transcription start site prediction and transcription factor binding prediction. Experiments demonstrate that our approach performs significantly better than other methods on both tasks.

    Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context

    Background: Different microarray studies have compiled gene lists for predicting outcomes of a range of treatments and diseases. These have produced gene lists that have little overlap, indicating that the results from any one study are unstable. It has been suggested that the underlying pathways are essentially identical, and that the expression of gene sets, rather than that of individual genes, may be more informative with respect to prognosis and understanding of the underlying biological process. Results: We sought to examine the stability of prognostic signatures based on gene sets rather than individual genes. We classified breast cancer cases from five microarray studies according to the risk of metastasis, using features derived from predefined gene sets. The expression levels of genes in the sets are aggregated, using what we call a set statistic. The resulting prognostic gene sets were as predictive as the lists of individual genes, but displayed more consistent rankings via bootstrap replications within datasets, produced more stable classifiers across different datasets, and are potentially more interpretable in the biological context since they examine gene expression in the context of their neighbouring genes in the pathway. In addition, we performed this analysis in each breast cancer molecular subtype, based on ER/HER2 status. The prognostic gene sets found in each subtype were consistent with the biology based on previous analysis of individual genes. Conclusions: To date, most analyses of gene expression data have focused at the level of the individual genes. We show that a complementary approach of examining the data using predefined gene sets can reduce the noise and could provide increased insight into the underlying biological pathways.
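
The abstract above says that expression levels of the genes in a set are aggregated into a "set statistic" but does not state which aggregation the authors used. The following is only a minimal sketch of the general idea, assuming a per-sample mean or median as the aggregate; the function name `set_statistic`, the gene symbols, and the values are all hypothetical.

```python
from statistics import mean, median


def set_statistic(expression, gene_set, aggregate=mean):
    """Aggregate one sample's expression values over the genes in a set.

    expression: dict mapping gene symbol -> expression level for one sample.
    gene_set:   iterable of gene symbols (e.g. a predefined pathway).
    aggregate:  function reducing a list of numbers to one number
                (an assumption; the abstract does not name the statistic).
    Genes that were not measured in the sample are skipped.
    """
    values = [expression[g] for g in gene_set if g in expression]
    if not values:
        raise ValueError("no genes from the set were measured in this sample")
    return aggregate(values)


# Hypothetical toy data: gene names and values are illustrative only.
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 9.0}
pathway = ["GENE_A", "GENE_B", "GENE_X"]  # GENE_X is not measured

# One feature per gene set, instead of one feature per gene.
features = {
    "pathway_mean": set_statistic(sample, pathway, mean),
    "pathway_median": set_statistic(sample, pathway, median),
}
```

Each gene set then contributes a single feature per sample to the classifier, which is the sense in which the signature is built from sets rather than from individual genes.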

    A bi-ordering approach to linking gene expression with clinical annotations in gastric cancer

    Background: In the study of cancer genomics, gene expression microarrays, which measure thousands of genes in a single assay, provide abundant information for the investigation of interesting genes or biological pathways. However, in order to analyze the large number of noisy measurements in microarrays, effective and efficient bioinformatics techniques are needed to identify the associations between genes and relevant phenotypes. Moreover, systematic tests are needed to validate the statistical and biological significance of those discoveries. Results: In this paper, we develop a robust and efficient method for exploratory analysis of microarray data, which produces a number of different orderings (rankings) of both genes and samples (reflecting correlation among those genes and samples). The core algorithm is closely related to biclustering, and so we first compare its performance with several existing biclustering algorithms on two real datasets - gastric cancer and lymphoma datasets. We then show on the gastric cancer data that the sample orderings generated by our method are highly statistically significant with respect to the histological classification of samples by using the Jonckheere trend test, while the gene modules are biologically significant with respect to biological processes (from the Gene Ontology). In particular, some of the gene modules associated with biclusters are closely linked to gastric cancer tumorigenesis reported in previous literature, while others are potentially novel discoveries. Conclusion: In conclusion, we have developed an effective and efficient method, Bi-Ordering Analysis, to detect informative patterns in gene expression microarrays by ranking genes and samples. In addition, a number of evaluation metrics were applied to assess both the statistical and biological significance of the resulting bi-orderings. The methodology was validated on gastric cancer and lymphoma datasets.

    Vascular histone deacetylation by pharmacological HDAC inhibition

    HDAC inhibitors can regulate gene expression by post-translational modification of histone as well as nonhistone proteins. Often studied at single loci, increased histone acetylation is the paradigmatic mechanism of action. However, little is known of the extent of genome-wide changes in cells stimulated by the hydroxamic acids, TSA and SAHA. In this article, we map vascular chromatin modifications including histone H3 acetylation of lysine 9 and 14 (H3K9/14ac) using chromatin immunoprecipitation (ChIP) coupled with massive parallel sequencing (ChIP-seq). Since acetylation-mediated gene expression is often associated with modification of other lysine residues, we also examined H3K4me3 and H3K9me3 as well as changes in CpG methylation (CpG-seq). RNA sequencing indicates the differential expression of ~30% of genes, with almost equal numbers being up- and down-regulated. We observed broad deacetylation and gene expression changes conferred by TSA and SAHA mediated by the loss of EP300/CREBBP binding at multiple gene promoters. This study provides an important framework for HDAC inhibitor function in vascular biology and a comprehensive description of genome-wide deacetylation by pharmacological HDAC inhibition.

    is-rSNP: a novel technique for in silico regulatory SNP detection

    Motivation: Determining the functional impact of non-coding disease-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) is challenging. Many of these SNPs are likely to be regulatory SNPs (rSNPs): variations which affect the ability of a transcription factor (TF) to bind to DNA. However, experimental procedures for identifying rSNPs are expensive and labour-intensive. Therefore, in silico methods are required for rSNP prediction. By scoring two alleles with a TF position weight matrix (PWM), it can be determined which SNPs are likely rSNPs. However, predictions made in this manner are noisy, and no method exists that determines the statistical significance of a nucleotide variation on a PWM score.
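
The baseline the abstract describes, scoring both alleles of a SNP against a TF position weight matrix and comparing the scores, can be sketched as follows. This is not the is-rSNP algorithm itself: it is only the naive comparison that the abstract calls noisy, and it attaches no statistical significance to the difference. The 4-position PWM, the uniform background frequency, the sequences, and all function names are hypothetical.

```python
import math

# Hypothetical 4-position PWM: base probabilities per position
# (illustrative values, not taken from any real transcription factor).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds(site):
    """Log-odds score of one site whose length equals the PWM width."""
    return sum(math.log2(PWM[i][base] / BACKGROUND) for i, base in enumerate(site))


def best_score(seq):
    """Highest log-odds score over all PWM-width windows in the sequence."""
    width = len(PWM)
    return max(log_odds(seq[i:i + width]) for i in range(len(seq) - width + 1))


def allele_score_difference(flank5, ref, alt, flank3):
    """Best PWM score with the reference allele minus that with the alternate."""
    return best_score(flank5 + ref + flank3) - best_score(flank5 + alt + flank3)


# Hypothetical SNP (C > T) that disrupts an "ACGT" site.
delta = allele_score_difference("TTA", "C", "T", "GTAA")
```

A large positive or negative `delta` suggests that the variant changes predicted binding, but the raw difference alone says nothing about how likely such a change is by chance, which is the gap the abstract identifies.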

    Widespread FRA1-Dependent Control of Mesenchymal Transdifferentiation Programs in Colorectal Cancer Cells

    Tumor invasion and metastasis involve complex remodeling of gene expression programs governing epithelial homeostasis. Mutational activation of the RAS-ERK pathway is a frequent occurrence in many cancers and has been shown to drive overexpression of the AP-1 family transcription factor FRA1, a potent regulator of migration and invasion in a variety of tumor cell types. However, the nature of FRA1 transcriptional targets and the molecular pathways through which they promote tumor progression remain poorly understood. We found that FRA1 was strongly expressed in tumor cells at the invasive front of human colorectal cancers (CRCs), and that its depletion suppressed mesenchymal-like features in CRC cells in vitro. Genome-wide analysis of FRA1 chromatin occupancy and transcriptional regulation identified epithelial-mesenchymal transition (EMT)-related genes as a major class of direct FRA1 targets in CRC cells. Expression of the pro-mesenchymal subset of these genes predicted adverse outcomes in CRC patients, and involved FRA1-dependent regulation and cooperation with the TGFβ signaling pathway. Our findings reveal an unexpectedly widespread and direct role for FRA1 in control of epithelial-mesenchymal plasticity in CRC cells, and suggest that FRA1 plays an important role in mediating cross talk between oncogenic RAS-ERK and TGFβ signaling networks during tumor progression. This work was supported by project grants 1026228 and 1044168 (to A.S.D.) and Senior Research Fellowships (to R.D.H., R.B.P. and J.M.M.) from the National Health and Medical Research Council of Australia.

    Appraisal of progenitor markers in the context of molecular classification of breast cancers

    Clinical management of breast cancer relies on case stratification, which increasingly employs molecular markers. The motivation behind delineating breast epithelial differentiation is to better target cancer cases through innate sensitivities bequeathed to the cancer from its normal progenitor state. A combination of histopathological and molecular classification of breast cancer cases suggests a role for progenitors in particular breast cancer cases. Although a remarkable fraction of the real tissue repertoire is maintained within a population of independent cell line cultures, some steps that are closer to the terminal differentiation state and that form a majority of primary human breast tissues are missing in the cell line cultures. This raises concerns about current breast cancer models.

    Meta-analysis of gene expression microarrays with missing replicates

    Background: Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample size. However, some genes are not measured in all experiments, hence they cannot be included or their statistical significance cannot be appropriately estimated in traditional meta-analysis. Nonetheless, these genes, which we refer to as incomplete genes, may also be informative and useful. Results: We propose a meta-analysis framework, called "Incomplete Gene Meta-analysis", which can include incomplete genes by imputing the significance of missing replicates, and computing a meta-score for every gene across all datasets. We demonstrate that the incomplete genes are worthy of being included and our method is able to appropriately estimate their significance in two groups of experiments. We first apply the Incomplete Gene Meta-analysis and several comparable methods to five breast cancer datasets with an identical set of probes. We simulate incomplete genes by randomly removing a subset of probes from each dataset and demonstrate that our method consistently outperforms two other methods in terms of their false discovery rate. We also apply the methods to three gastric cancer datasets for the purpose of discriminating diffuse and intestinal subtypes. Conclusions: Meta-analysis is an effective approach that identifies more robust sets of differentially expressed genes from multiple studies. The incomplete genes that mainly arise from the use of different platforms may also have statistical and biological importance but are ignored or are not appropriately involved by previous studies. Our Incomplete Gene Meta-analysis is able to incorporate the incomplete genes by estimating their significance. The results on both breast and gastric cancer datasets suggest that the highly ranked genes and associated GO terms produced by our method are more significant and biologically meaningful according to the previous literature.

    Epigenetic Regulation of Cell Type–Specific Expression Patterns in the Human Mammary Epithelium

    Differentiation is an epigenetic program that involves the gradual loss of pluripotency and acquisition of cell type–specific features. Understanding these processes requires genome-wide analysis of epigenetic and gene expression profiles, which has been challenging in primary tissue samples due to the limited numbers of cells available. Here we describe the application of high-throughput sequencing technology for profiling histone and DNA methylation, as well as gene expression patterns of normal human mammary progenitor-enriched and luminal lineage-committed cells. We observed significant differences in histone H3 lysine 27 tri-methylation (H3K27me3) enrichment and DNA methylation of genes expressed in a cell type–specific manner, suggesting their regulation by epigenetic mechanisms and a dynamic interplay between the two processes that together define developmental potential. The technologies we developed and the epigenetically regulated genes we identified will accelerate the characterization of primary cell epigenomes and the dissection of human mammary epithelial lineage-commitment and luminal differentiation.