4,101 research outputs found

    Reproducible probe-level analysis of the Affymetrix Exon 1.0 ST array with R/Bioconductor

    The presence of different transcripts of a gene across samples can be analysed by whole-transcriptome microarrays. Reproducing results from published microarray data represents a challenge due to the vast amounts of data and the large variety of pre-processing and filtering steps employed before the actual analysis is carried out. To guarantee a firm basis for methodological development where results with new methods are compared with previous results it is crucial to ensure that all analyses are completely reproducible for other researchers. We here give a detailed workflow on how to perform reproducible analysis of the GeneChip Human Exon 1.0 ST Array at probe and probeset level solely in R/Bioconductor, choosing packages based on their simplicity of use. To exemplify the use of the proposed workflow we analyse differential splicing and differential gene expression in a publicly available dataset using various statistical methods. We believe this study will provide other researchers with an easy way of accessing gene expression data at different annotation levels and with the sufficient details needed for developing their own tools for reproducible analysis of the GeneChip Human Exon 1.0 ST Array

    Noncoder : a web interface for exon array-based detection of long non-coding RNAs

    Due to recent technical developments, a high number of long non-coding RNAs (lncRNAs) have been discovered in mammals. Although it has been shown that lncRNAs are regulated differently among tissues and disease statuses, functions of these transcripts are still unknown in most cases. GeneChip Exon 1.0 ST Arrays (exon arrays) from Affymetrix, Inc. have been used widely to profile genome-wide expression changes and alternative splicing of protein-coding genes. Here, we demonstrate that re-annotation of exon array probes can be used to profile expressions of tens of thousands of lncRNAs. With this annotation, a detailed inspection of lncRNAs and their isoforms is possible. To allow for a general usage to the research community, we developed a user-friendly web interface called 'noncoder'. By uploading CEL files from exon arrays and with a few mouse clicks and parameter settings, exon array data will be normalized and analysed to identify differentially expressed lncRNAs. Noncoder provides the detailed annotation information of lncRNAs and is equipped with unique features to allow for an efficient search for interesting lncRNAs to be studied further. The web interface is available at http://noncoder.mpi-bn.mpg.de

    Normalized Affymetrix expression data are biased by G-quadruplex formation

    Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, where inter-probe guanines form Hoogsteen hydrogen bonds, which probes with G-stacks are capable of forming. We demonstrate that for a specific microarray data set using the Human HG-U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ∼15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The elimination of these probes from any analysis in such affected data sets outweighs the increase of noise in the signal. © 2011 The Author(s)
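The G-stack criterion in the abstract above (a run of four or more guanines in a probe sequence) can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical and not taken from the paper; real Affymetrix probes are 25-mers, and the authors' actual filtering pipeline is not described here.

```python
import re

# A G-stack is defined in the abstract as a run of four or more guanines.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(sequence: str) -> bool:
    """Return True if the probe sequence contains a run of >= 4 guanines."""
    return G_STACK.search(sequence.upper()) is not None


def count_g_stack_probes(probe_sequences: list[str]) -> int:
    """Count how many probes in a probe set contain a G-stack."""
    return sum(has_g_stack(seq) for seq in probe_sequences)


def drop_g_stack_probes(probe_sequences: list[str]) -> list[str]:
    """Keep only the probes without a G-stack (hypothetical filtering step)."""
    return [seq for seq in probe_sequences if not has_g_stack(seq)]


# Toy probe set (shortened, made-up sequences for illustration only).
toy_probe_set = ["ACGGGGTACCT", "ACGGGTACCTA", "ttggggggaca", "ATCGATCGATC"]
n_affected = count_g_stack_probes(toy_probe_set)
filtered = drop_g_stack_probes(toy_probe_set)
```

Counting affected probes per probe set mirrors the abstract's observation that the bias grows with the number of G-stack probes in a probe set; removing them before summarization is one possible reading of "elimination of these probes".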

    A gene signature for post-infectious chronic fatigue syndrome

    Background: At present, there are no clinically reliable disease markers for chronic fatigue syndrome. DNA chip microarray technology provides a method for examining the differential expression of mRNA from a large number of genes. Our hypothesis was that a gene expression signature, generated by microarray assays, could help identify genes which are dysregulated in patients with post-infectious CFS and so help identify biomarkers for the condition. Methods: Human genome-wide Affymetrix GeneChip arrays (39,000 transcripts derived from 33,000 gene sequences) were used to compare the levels of gene expression in the peripheral blood mononuclear cells of male patients with post-infectious chronic fatigue (n = 8) and male healthy control subjects (n = 7). Results: Patients and healthy subjects differed significantly in the level of expression of 366 genes. Analysis of the differentially expressed genes indicated functional implications in immune modulation, oxidative stress and apoptosis. Prototype biomarkers were identified on the basis of differential levels of gene expression and possible biological significance. Conclusion: Differential expression of key genes identified in this study offers an insight into the possible mechanism of chronic fatigue following infection. The representative biomarkers identified in this research appear promising as potential biomarkers for diagnosis and treatment

    Identification of transcriptional regulatory networks specific to pilocytic astrocytoma.

    Background: Pilocytic Astrocytomas (PAs) are common low-grade central nervous system malignancies for which few recurrent and specific genetic alterations have been identified. In an effort to better understand the molecular biology underlying the pathogenesis of these pediatric brain tumors, we performed higher-order transcriptional network analysis of a large gene expression dataset to identify gene regulatory pathways that are specific to this tumor type, relative to other, more aggressive glial or histologically distinct brain tumours. Methods: RNA derived from frozen human PA tumours was subjected to microarray-based gene expression profiling, using Affymetrix U133Plus2 GeneChip microarrays. This data set was compared to similar data sets previously generated from non-malignant human brain tissue and other brain tumour types, after appropriate normalization. Results: In this study, we examined gene expression in 66 PA tumors compared to 15 non-malignant cortical brain tissues, and identified 792 genes that demonstrated consistent differential expression between independent sets of PA and non-malignant specimens. From this entire 792 gene set, we used the previously described PAP tool to assemble a core transcriptional regulatory network composed of 6 transcription factor genes (TFs) and 24 target genes, for a total of 55 interactions. A similar analysis of oligodendroglioma and glioblastoma multiforme (GBM) gene expression data sets identified distinct, but overlapping, networks. Most importantly, comparison of each of the brain tumor type-specific networks revealed a network unique to PA that included repressed expression of ONECUT2, a gene frequently methylated in other tumor types, and 13 other uniquely predicted TF-gene interactions. Conclusions: These results suggest specific transcriptional pathways that may operate to create the unique molecular phenotype of PA and thus opportunities for corresponding targeted therapeutic intervention. Moreover, this study also demonstrates how integration of gene expression data with TF-gene and TF-TF interaction data is a powerful approach to generating testable hypotheses to better understand cell-type specific genetic programs relevant to cancer

    Micro-Environment Causes Reversible Changes in DNA Methylation and mRNA Expression Profiles in Patient-Derived Glioma Stem Cells

    In vitro and in vivo models are widely used in cancer research. Characterizing the similarities and differences between a patient's tumor and corresponding in vitro and in vivo models is important for understanding the potential clinical relevance of experimental data generated with these models. Towards this aim, we analyzed the genomic aberrations, DNA methylation and transcriptome profiles of five parental tumors and their matched in vitro isolated glioma stem cell (GSC) lines and xenografts generated from these same GSCs using high-resolution platforms. We observed that the methylation and transcriptome profiles of in vitro GSCs were significantly different from their corresponding xenografts, which were actually more similar to their original parental tumors. This points to the potentially critical role of the brain microenvironment in influencing methylation and transcriptional patterns of GSCs. Consistent with this possibility, ex vivo cultured GSCs isolated from xenografts showed a tendency to return to their initial in vitro states even after a short time in culture, supporting a rapid dynamic adaptation to the in vitro microenvironment. These results show that methylation and transcriptome profiles are highly dependent on the microenvironment and that growth in orthotopic sites partially reverses the changes caused by in vitro culturing

    A powerful method for detecting differentially expressed genes from GeneChip arrays that does not require replicates

    BACKGROUND: Studies of differential expression that use Affymetrix GeneChip arrays are often carried out with a limited number of replicates. Reasons for this include financial considerations and limits on the available amount of RNA for sample preparation. In addition, failed hybridizations are not uncommon leading to a further reduction in the number of replicates available for analysis. Most existing methods for studying differential expression rely on the availability of replicates and the demand for alternative methods that require few or no replicates is high. RESULTS: We describe a statistical procedure for performing differential expression analysis without replicates. The procedure relies on a Bayesian integrated approach (BGX) to the analysis of Affymetrix GeneChips. The BGX method estimates a posterior distribution of expression for each gene and condition, from a simultaneous consideration of the available probe intensities representing the gene in a condition. Importantly, posterior distributions of expression are obtained regardless of the number of replicates available. We exploit these posterior distributions to create ranked gene lists that take into account the estimated expression difference as well as its associated uncertainty. We estimate the proportion of non-differentially expressed genes empirically, allowing an informed choice of cut-off for the ranked gene list, adapting an approach proposed by Efron. We assess the performance of the method, and compare it to those of other methods, on publicly available spike-in data sets, as well as in a proper biological setting. CONCLUSION: The method presented is a powerful tool for extracting information on differential expression from GeneChip expression studies with limited or no replicates

    Fisher's combined p-value for detecting differentially expressed genes using Affymetrix expression arrays

    BACKGROUND: Currently, most tests of differential gene expression using Affymetrix expression array data are performed using expression summary values representing each probe set on a microarray. Recently, testing methods have been proposed which incorporate probe level information. We propose a new approach that uses Fisher's method of combining evidence from multiple sources of information. Specifically, we combine p-values from probe level tests of significance. RESULTS: The combined p method and other competing methods were compared using three spike-in datasets (where probe sets corresponding to differentially spiked transcripts are known) and array data from a biological study validated with qRT-PCR. Based on power and false discovery rates computed for the spike-in datasets, we demonstrate that the combined p method compares favorably with other methods. We find that probe level testing methods select many of the same genes as differentially expressed. We illustrate the use of the combined p method for diagnostic purposes using examples. CONCLUSION: Combined p is a promising alternative to existing methods of testing for differential gene expression. In addition, the combined p method is particularly well suited as a diagnostic tool for exploratory analysis of microarray data
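Fisher's method, which the abstract above applies to probe-level p-values, combines k independent p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below is a generic illustration of that textbook method, not the authors' implementation: the function name and example p-values are hypothetical, and the independence assumption may not hold for probes within one probe set.

```python
import math


def fisher_combined_p(p_values: list[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of
    freedom under the null. For even degrees of freedom the survival
    function has a closed form:
        P(chi2_{2k} >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no external statistics library is required.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Accumulate the finite series term by term: term_i = (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical probe-level p-values for a single probe set.
probe_p_values = [0.05, 0.05]
combined = fisher_combined_p(probe_p_values)
```

With a single p-value the method returns that p-value unchanged, which is a convenient sanity check; with two p-values the result reduces to p1·p2·(1 − ln(p1·p2)).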

    Transcriptome pathways unique to dehydration tolerant relatives of modern wheat

    Among abiotic stressors, drought is a major factor responsible for dramatic yield loss in agriculture. In order to reveal differences in global expression profiles of drought tolerant and sensitive wild emmer wheat genotypes, a previously deployed shock-like dehydration process was utilized to compare transcriptomes at two time points in root and leaf tissues using the Affymetrix GeneChip® Wheat Genome Array hybridization. The comparison of transcriptomes reveals several unique genes or expression patterns such as differential usage of IP3-dependent signal transduction pathways, ethylene- and abscisic acid (ABA)-dependent signaling, and preferential or faster induction of ABA-dependent transcription factors by the tolerant genotype that distinguish contrasting genotypes indicative of distinctive stress response pathways. The data also show that wild emmer wheat is capable of engaging known drought stress responsive mechanisms. The global comparison of transcriptomes in the absence of and after dehydration underlined the gene networks especially in root tissues that may have been lost in the selection processes generating modern bread wheats

    Performance evaluation of commercial short-oligonucleotide microarrays and the impact of noise in making cross-platform correlations

    BACKGROUND: Despite the widespread use of microarrays, much ambiguity regarding data analysis, interpretation and correlation of the different technologies exists. There is a considerable amount of interest in correlating results obtained between different microarray platforms. To date, only a few cross-platform evaluations have been published and unfortunately, no guidelines have been established on the best methods of making such correlations. To address this issue we conducted a thorough evaluation of two commercial microarray platforms to determine an appropriate methodology for making cross-platform correlations. RESULTS: In this study, expression measurements for 10,763 genes uniquely represented on Affymetrix U133A/B GeneChips® and Amersham CodeLink™ UniSet Human 20 K microarrays were compared. For each microarray platform, five technical replicates, derived from the same total RNA samples, were labeled, hybridized, and quantified according to each manufacturer's standard protocols. The correlation coefficient (r) of differential expression ratios for the entire set of 10,763 overlapping genes was 0.62 between platforms. However, the correlation improved significantly (r = 0.79) when genes within noise were excluded. In addition to levels of inter-platform correlation, we evaluated precision, statistical-significance profiles, power, and noise levels for each microarray platform. Accuracy of differential expression was measured against real-time PCR for 25 genes and both platforms correlated well with r values of 0.92 and 0.79 for CodeLink and GeneChip, respectively. CONCLUSIONS: As a result of this study, we recommend using only genes called 'present' in cross-platform correlations. However, as in this study, a large number of genes may be lost from the correlation due to differing levels of noise between platforms. This is an important consideration given the apparent difference in sensitivity of the two platforms. Data from microarray analysis need to be interpreted cautiously and therefore, we provide guidelines for making cross-platform correlations. In all, this study represents the most comprehensive and specifically designed comparison of short-oligonucleotide microarray platforms to date using the largest set of overlapping genes
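The recommendation above (correlate differential expression ratios only for genes called 'present' on both platforms) can be sketched as follows. This is an illustrative reading under stated assumptions, not the study's procedure: the function names, the dictionary layout, and the toy values are all hypothetical, and how 'present' calls are derived is left outside the sketch.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def correlation_present_only(ratios_a, ratios_b, present_a, present_b):
    """Correlate log ratios over genes flagged 'present' on both platforms.

    ratios_* map gene id -> differential expression (log) ratio;
    present_* map gene id -> bool detection call (hypothetical inputs).
    Returns (r, number of genes retained).
    """
    kept = [
        gene
        for gene in ratios_a
        if gene in ratios_b and present_a.get(gene) and present_b.get(gene)
    ]
    r = pearson_r([ratios_a[g] for g in kept], [ratios_b[g] for g in kept])
    return r, len(kept)


# Toy data: g4 is within noise on platform A and is therefore excluded.
ratios_a = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": -5.0}
ratios_b = {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 9.0}
present_a = {"g1": True, "g2": True, "g3": True, "g4": False}
present_b = {"g1": True, "g2": True, "g3": True, "g4": True}
r_filtered, n_kept = correlation_present_only(
    ratios_a, ratios_b, present_a, present_b
)
```

Returning the number of retained genes alongside r makes the trade-off noted in the abstract visible: filtering on 'present' calls can raise the correlation while discarding a large share of the overlapping genes.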