920 research outputs found

    Bimodal gene expression and biomarker discovery.

    With insights gained through molecular profiling, cancer is recognized as a heterogeneous disease with distinct subtypes and outcomes that can be predicted by a limited number of biomarkers. Statistical methods such as supervised classification and machine learning identify distinguishing features associated with disease subtype, but these features are not necessarily clear or interpretable on a biological level. Genes with bimodal transcript expression, however, may serve as excellent candidates for disease biomarkers, with each mode of expression readily interpretable as a biological state. The recent article by Wang et al., "The Bimodality Index: A Criterion for Discovering and Ranking Bimodal Signatures from Cancer Gene Expression Profiling Data," provides a bimodality index for identifying and scoring transcript expression profiles as biomarker candidates, with the benefit of having a direct relation to power and sample size. This represents an important step in candidate biomarker discovery that may help streamline the pipeline through validation and clinical application.
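
As a rough illustration of the index discussed above: for a two-component normal mixture with component means μ1 and μ2, a common standard deviation σ, and mixing proportion π, the bimodality index of Wang et al. is BI = sqrt(π(1 − π)) · |μ1 − μ2| / σ. The sketch below computes it from parameters that are assumed to have been estimated already; the function name and the example values are hypothetical and are not taken from the article.

```python
import math


def bimodality_index(mu1, mu2, sigma, pi):
    """BI = sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma.

    Assumes an equal-variance two-component normal mixture whose
    parameters were estimated elsewhere (for example by EM).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not 0 < pi < 1:
        raise ValueError("pi must lie strictly between 0 and 1")
    delta = abs(mu1 - mu2) / sigma  # standardized distance between the modes
    return math.sqrt(pi * (1 - pi)) * delta


# Hypothetical values: same separation, different mixing proportions.
balanced = bimodality_index(mu1=4.0, mu2=8.0, sigma=1.0, pi=0.5)
skewed = bimodality_index(mu1=4.0, mu2=8.0, sigma=1.0, pi=0.1)
print(balanced, skewed)
```

With the separation held fixed, the index is largest when the two modes are equally populated and shrinks as one mode becomes rare, which is why it can be used to rank candidate genes.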

    Human and mouse switch-like genes share common transcriptional regulatory mechanisms for bimodality

    BACKGROUND: Gene expression is controlled over a wide range at the transcript level through complex interplay between DNA and regulatory proteins, resulting in profiles of gene expression that can be represented as normal, graded, and bimodal (switch-like) distributions. We have previously performed genome-scale identification and annotation of genes with switch-like expression at the transcript level in mouse, using large microarray datasets for healthy tissue, in order to study the cellular pathways and regulatory mechanisms involving this class of genes. We showed that a large population of bimodal mouse genes encoding cell membrane and extracellular matrix proteins is involved in communication pathways. This study expands on previous results by annotating human bimodal genes, investigating their correspondence to bimodality in mouse orthologs, and exploring possible regulatory mechanisms that contribute to bimodality in gene expression in human and mouse. RESULTS: Fourteen percent of the human genes on the HGU133A array (1847 out of 13076) were identified as bimodal or switch-like. More than 40% were found to have bimodal mouse orthologs. KEGG pathways enriched for bimodal genes included ECM-receptor interaction, focal adhesion, and tight junction, showing strong similarity to the results obtained in mouse. Tissue-specific modes of expression of bimodal genes among brain, heart, and skeletal muscle were common between human and mouse. Promoter analysis revealed a higher than average number of transcription start sites per gene within the set of bimodal genes. Moreover, the bimodal gene set had differentially methylated histones compared to the set of the remaining genes in the genome. CONCLUSION: The fact that bimodal genes were enriched within the cell membrane and extracellular environment makes these genes candidates for biomarkers for tissue specificity. The commonality of the important roles bimodal genes play in tissue differentiation in both human and mouse indicates the potential value of mouse data in providing context for human tissue studies. The regulation motifs enriched in the bimodal gene set (TATA boxes, alternative promoters, methylation) have known associations with complex diseases, such as cancer, providing further potential for the use of bimodal genes in studying the molecular basis of disease.

    iSeqQC: a tool for expression-based quality control in RNA sequencing.

    BACKGROUND: Quality control in any high-throughput sequencing technology is a critical step which, if overlooked, can compromise an experiment and the resulting conclusions. A number of methods exist to identify biases during sequencing or alignment, yet few tools exist to interpret biases due to outliers. RESULTS: Hence, we developed iSeqQC, an expression-based QC tool that detects outliers produced either by variable laboratory conditions or by dissimilarity within a phenotypic group. iSeqQC implements various statistical approaches, including unsupervised clustering, agglomerative hierarchical clustering, and correlation coefficients, to provide insight into outliers. It can be used through the command line (GitHub: https://github.com/gkumar09/iSeqQC) or a web interface (http://cancerwebpa.jefferson.edu/iSeqQC). A local Shiny installation can also be obtained from GitHub (https://github.com/gkumar09/iSeqQC). CONCLUSION: iSeqQC is a fast, lightweight, expression-based QC tool that detects outliers by implementing various statistical approaches.
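
To make the correlation-based part of this idea concrete, here is a minimal sketch of flagging outlier samples from a genes-by-samples expression matrix. It is not iSeqQC's implementation: the function name, the log2(x + 1) transform, and the `max_drop` cutoff are assumptions chosen only for illustration, and the data are simulated.

```python
import numpy as np


def flag_outlier_samples(expr, max_drop=0.2):
    """Flag samples whose mean correlation with the others is unusually low.

    expr: genes x samples matrix of non-negative expression values.
    max_drop: hypothetical cutoff; a sample is flagged when its mean
        Pearson correlation falls more than this far below the median.
    """
    log_expr = np.log2(np.asarray(expr, dtype=float) + 1.0)
    corr = np.corrcoef(log_expr, rowvar=False)  # samples x samples
    n_samples = corr.shape[0]
    # Average over the other samples only (drop the self-correlation of 1).
    mean_corr = (corr.sum(axis=1) - 1.0) / (n_samples - 1)
    flags = mean_corr < np.median(mean_corr) - max_drop
    return mean_corr, flags


# Simulated data: five replicates of one profile plus one unrelated sample.
rng = np.random.default_rng(0)
profile = rng.gamma(shape=2.0, scale=50.0, size=500)
counts = profile[:, None] * rng.lognormal(0.0, 0.1, size=(500, 6))
counts[:, 5] = rng.gamma(shape=2.0, scale=50.0, size=500)

mean_corr, flags = flag_outlier_samples(counts)
print(np.round(mean_corr, 3), flags)
```

A fixed drop below the median is used here because it is easy to reason about; a real tool would more likely combine several views of the data (clustering, principal components, correlation heatmaps) before calling a sample an outlier.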

    Switch-like genes populate cell communication pathways and are enriched for extracellular proteins

    BACKGROUND: Recent studies have placed gene expression in the context of distribution profiles including housekeeping, graded, and bimodal (switch-like). Single-gene studies have shown that bimodal expression results from healthy cell signaling and from complex diseases such as cancer; however, developing a comprehensive list of human bimodal genes has remained a major challenge due to inherent noise in human microarray data. This study presents a two-component mixture analysis of mouse gene expression data for genes on the Affymetrix MG-U74Av2 array for the detection and annotation of switch-like genes. Two-component normal mixtures were fit to the data to identify bimodal genes and their potential roles in cell signaling and disease progression. RESULTS: Seventeen percent of the genes on the MG-U74Av2 array (1519 out of 9091) were identified as bimodal or switch-like. KEGG pathways significantly enriched for bimodal genes included ECM-receptor interaction, cell communication, and focal adhesion. Similarly, the GO biological process "cell adhesion" and cellular component "extracellular matrix" were significantly enriched. Switch-like genes were found to be associated with such diseases as congestive heart failure, Alzheimer's disease, arteriosclerosis, breast neoplasms, hypertension, myocardial infarction, obesity, rheumatoid arthritis, and type I and type II diabetes. In diabetes alone, over two hundred bimodal genes were in a different mode of expression compared to normal tissue. CONCLUSION: This research identified and annotated bimodal or switch-like genes in the mouse genome using a large collection of microarray data. Genes with bimodal expression were enriched within the cell membrane and extracellular environment. Hundreds of bimodal genes demonstrated alternate modes of expression in diabetic muscle, pancreas, liver, heart, and adipose tissue. Bimodal genes comprise a candidate set of biomarkers for a large number of disease states because their expression is tightly regulated at the transcription level.

    RB loss contributes to aggressive tumor phenotypes in MYC-driven triple negative breast cancer

    Triple negative breast cancer (TNBC) is characterized by multiple genetic events occurring in concert to drive pathogenic features of the disease. Here we interrogated the coordinate impact of p53, RB, and MYC in a genetic model of TNBC, in parallel with the analysis of clinical specimens. Primary mouse mammary epithelial cells (mMEC) with defined genetic features were used to delineate the combined action of RB and/or p53 in the genesis of TNBC. In this context, the deletion of either RB or p53, alone or in combination, increased the proliferation of mMEC; however, the cells did not have the capacity to invade in Matrigel. Gene expression profiling revealed that loss of each tumor suppressor has effects related to proliferation, but RB loss in particular leads to alterations in gene expression associated with the epithelial-to-mesenchymal transition. The overexpression of MYC in combination with p53 loss or combined RB/p53 loss drove rapid cell growth. While the effects of MYC overexpression had a dominant impact on gene expression, loss of RB further enhanced the deregulation of a gene expression signature associated with invasion. Specific RB loss led to enhanced invasion in Boyden chamber assays and gave rise to tumors with minimal epithelial characteristics relative to RB-proficient models. Therapeutic screening revealed that RB-deficient cells were particularly resistant to agents targeting the PI3K and MEK pathways. Consistent with the aggressive behavior of the preclinical models of MYC overexpression and RB loss, human TNBC tumors that express high levels of MYC and are devoid of RB have a particularly poor outcome. Together these results underscore the potency of tumor suppressor pathways in specifying the biology of breast cancer. Further, they demonstrate that MYC overexpression in concert with RB can promote a particularly aggressive form of TNBC.

    Annotation and function of switch-like genes in health and disease

    Gene expression microarrays provide transcript-level measurements across entire genomes and are traditionally used for differential expression analysis between health and disease or classification of disease subtypes. The abundance of gene expression microarray data currently available to the scientific community makes it possible to assess gene transcript levels among diverse tissue types for an entire genome. Gene expression is controlled over a wide range at the transcript level through complex interplay between DNA and regulatory proteins, resulting in gene expression profiles that can be represented as normal, graded, and bimodal (switch-like) distributions. It is our assertion that these distributions of gene expression, notably the bimodal distribution, result from biologically relevant regulation events. We have performed genome-scale identification and annotation of genes with bimodal, switch-like expression at the transcript level in human and mouse, using large microarray datasets for healthy tissue, in order to study the cellular pathways and regulatory mechanisms involving this class of genes. Our method implemented a likelihood ratio test to identify bimodal genes by comparing the best-fit two-component normal mixture, estimated using the expectation maximization algorithm, against a single-component normal distribution for each gene. This procedure identified roughly 15% of genes in human and mouse as bimodal, with a substantial overlap between human genes and their orthologous mouse counterparts. A survey of biological pathways revealed that the set of bimodal genes plays a role in cell communication and signaling with the external environment. Our analysis of regulatory sequence regions for bimodal genes revealed characteristics including enrichment of TATA boxes and an increased number of alternative transcription start sites. 
In addition to regulatory sequence analysis, we explored aspects of epigenetic regulation for their activity among the set of bimodal genes. We performed meta-analysis of gene expression microarray, DNA methylation, and histone methylation datasets representing human stem cells and liver tissue to reveal that the mode of expression within switch-like genes is primarily associated with histone methylation status. These results provide insight into normal patterns of histone methylation in healthy, differentiated tissue types. Aberrant methylation is a known marker in the progression of cancer, so these switch-like genes may also provide a valuable reference in disease diagnosis and prognosis. The method presented for bimodal gene identification also allows for an alternate approach to differential gene expression analysis between tissues and disease subtypes. Ph.D., Biomedical Engineering -- Drexel University, 200
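
The likelihood ratio comparison described in this abstract can be sketched as follows: fit a two-component normal mixture by expectation maximization, fit a single normal, and take twice the difference in log-likelihood. This is a minimal sketch and not the thesis code; the function names, the initialization from the sorted halves of the data, the unequal component variances, and the simulated data are all assumptions. The null distribution of this statistic is non-standard for mixtures, so the sketch reports the statistic only and leaves the choice of cutoff open.

```python
import numpy as np


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - (x - mu) ** 2 / (2.0 * sigma**2)


def mixture_loglik(x, mu, sigma, pi):
    """Log-likelihood of a two-component mixture with weights (pi, 1 - pi)."""
    comp0 = np.log(pi) + normal_logpdf(x, mu[0], sigma[0])
    comp1 = np.log(1.0 - pi) + normal_logpdf(x, mu[1], sigma[1])
    return np.logaddexp(comp0, comp1).sum()


def fit_mixture_em(x, n_iter=500, tol=1e-8):
    """Fit a two-component normal mixture by EM (hypothetical helper)."""
    x = np.sort(np.asarray(x, dtype=float))
    if x.size < 2 or x.std() == 0:
        raise ValueError("need at least two distinct values")
    half = x.size // 2
    # Assumed initialization: lower and upper halves of the sorted data.
    mu = np.array([x[:half].mean(), x[half:].mean()])
    sigma = np.full(2, x.std())
    pi = 0.5
    loglik = mixture_loglik(x, mu, sigma, pi)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 0.
        log_w0 = np.log(pi) + normal_logpdf(x, mu[0], sigma[0])
        log_w1 = np.log(1.0 - pi) + normal_logpdf(x, mu[1], sigma[1])
        resp0 = np.exp(log_w0 - np.logaddexp(log_w0, log_w1))
        resp = np.vstack([resp0, 1.0 - resp0])
        # M-step: weighted means, variances, and mixing proportion.
        nk = np.maximum(resp.sum(axis=1), 1e-12)
        mu = resp @ x / nk
        var = (resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk
        sigma = np.sqrt(np.maximum(var, 1e-6))  # floor avoids collapsed components
        pi = float(np.clip(nk[0] / x.size, 1e-6, 1.0 - 1e-6))
        new_loglik = mixture_loglik(x, mu, sigma, pi)
        converged = new_loglik - loglik < tol
        loglik = new_loglik
        if converged:
            break
    return loglik, mu, sigma, pi


def bimodality_lrt(x):
    """Return 2 * (loglik of mixture - loglik of single normal) and the fit."""
    x = np.asarray(x, dtype=float)
    loglik_one = normal_logpdf(x, x.mean(), x.std()).sum()
    loglik_two, mu, sigma, pi = fit_mixture_em(x)
    return 2.0 * (loglik_two - loglik_one), mu, sigma, pi


# Simulated log-expression values for one switch-like and one unimodal gene.
rng = np.random.default_rng(0)
switch_like = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(6.0, 1.0, 100)])
unimodal = rng.normal(3.0, 1.0, 200)

lrt_switch, mu_switch, _, _ = bimodality_lrt(switch_like)
lrt_uni, _, _, _ = bimodality_lrt(unimodal)
print(lrt_switch, lrt_uni, mu_switch)
```

In a genome-scale setting the same function would be applied to each gene's expression vector across samples, with genes ranked or thresholded by the resulting statistic.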

    Genome-wide redistribution of MeCP2 in dorsal root ganglia after peripheral nerve injury.

    BACKGROUND: Methyl-CpG-binding protein 2 (MeCP2), a protein with affinity for methylated cytosines, is crucial for neuronal development and function. MeCP2 regulates gene expression through activation, repression and chromatin remodeling. Mutations in MeCP2 cause Rett syndrome, and these patients display impaired nociception. We observed an increase in MeCP2 expression in mouse dorsal root ganglia (DRG) after peripheral nerve injury. The functional implication of increased MeCP2 is largely unknown. To identify regions of the genome bound by MeCP2 in the DRG and the changes induced by nerve injury, chromatin immunoprecipitation of MeCP2 followed by sequencing (ChIP-seq) was performed 4 weeks after spared nerve injury (SNI). RESULTS: While the number of binding sites across the genome remained similar in the SNI model and sham control, SNI induced the redistribution of MeCP2 to transcriptionally relevant regions. To determine how differential binding of MeCP2 can affect gene expression in the DRG, we investigated mmu-miR-126, a microRNA locus that had enriched MeCP2 binding in the SNI model. Enriched MeCP2 binding to the miR-126 locus after nerve injury repressed miR-126 expression, and this was not mediated by alterations in the methylation pattern at the miR-126 locus. Downregulation of miR-126 resulted in the upregulation of its two target genes, Dnmt1 and Vegfa, in Neuro 2A cells and in the SNI model compared to control. These target genes were significantly downregulated in Mecp2-null mice compared to wild-type littermates, indicating a regulatory role for MeCP2 in activating Dnmt1 and Vegfa expression. Intrathecal delivery of miR-126 was not sufficient to reverse nerve injury-induced mechanical and thermal hypersensitivity, but decreased Dnmt1 and Vegfa expression in the DRG. CONCLUSIONS: Our study shows a regulatory role for MeCP2, in that its global redistribution can result in direct and indirect modulation of gene expression in the DRG. Alterations in genome-wide binding of MeCP2 therefore provide a molecular basis for a better understanding of epigenetic regulation-induced molecular changes underlying nerve injury.

    Prediction potential of candidate biomarker sets identified and validated on gene expression data from multiple datasets

    BACKGROUND: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles from publicly available microarray datasets of cancer (breast, lymphoma and renal) samples linked to clinical information with an iterative machine learning algorithm. ROC curves were used to assess the prediction error of each profile for classification. We compared the prediction error of profiles correlated with molecular phenotype against profiles correlated with relapse-free status. Prediction error of profiles identified with supervised univariate feature selection algorithms was compared to that of profiles selected randomly from a) all genes on the microarray platform and b) a list of known disease-related genes (a priori selection). We also determined the relevance of expression profiles on test arrays from independent datasets, measured on either the same or different microarray platforms. RESULTS: Highly discriminative expression profiles were produced on both simulated gene expression data and expression data from breast cancer and lymphoma datasets on the basis of ER and BCL-6 expression, respectively. Use of relapse-free status to identify profiles for prognosis prediction resulted in poorly discriminative decision rules. Supervised feature selection resulted in more accurate classifications than random or a priori selection; however, the difference in prediction error decreased as the number of features increased. These results held when decision rules were applied across datasets to samples profiled on the same microarray platform. CONCLUSION: Our results show that many gene sets predict molecular phenotypes accurately. Given this, expression profiles identified using different training datasets should be expected to show little agreement. In addition, we demonstrate the difficulty in predicting relapse directly from microarray data using supervised machine learning approaches. These findings are relevant to the use of molecular profiling for the identification of candidate biomarker panels.
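
As a small illustration of how a ROC-based measure of discrimination can be computed for one profile, the sketch below evaluates the area under the ROC curve as the fraction of (positive, negative) sample pairs that the profile score ranks correctly, counting ties as half. It is not the study's pipeline; the function name, scores, and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier or profile scores, higher meaning "more positive".
    labels: 1 for positive samples (e.g. ER-positive), 0 for negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Each correctly ordered pair counts 1, each tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six samples: three positive, three negative.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
perfect = roc_auc([3, 2, 1, 0], [1, 1, 0, 0])
uninformative = roc_auc([1, 1, 1, 1], [1, 1, 0, 0])
print(auc, perfect, uninformative)
```

An AUC near 1 corresponds to a highly discriminative profile and a value near 0.5 to a poorly discriminative one, which is the contrast the abstract draws between molecular phenotype and relapse-free status.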