Spike-in validation of an Illumina-specific variance-stabilizing transformation
BACKGROUND: Variance-stabilizing techniques have been used for some time in the analysis of gene expression microarray data. A new adaptation, the variance-stabilizing transformation (VST), has recently been developed to take advantage of the unique features of Illumina BeadArrays. VST has been shown to perform well in comparison with the widely-used approach of taking a log2 transformation, but has not been validated on a spike-in experiment. We apply VST to the data from a recently published spike-in experiment and compare it both to a regular log2 analysis and a recently recommended analysis that can be applied if all raw data are available. FINDINGS: VST provides more power to detect differentially expressed genes than a log2 transformation. However, the gain in power is roughly the same as utilizing the raw data from an experiment and weighting observations accordingly. VST is still advantageous when large changes in expression are anticipated, while a weighted log2 approach performs better for smaller changes. CONCLUSION: VST can be recommended for summarized Illumina data regardless of which Illumina pre-processing options have been used. However, using the raw data is still encouraged whenever possible
Genome-Wide Identification of Functionally Distinct Subsets of Cellular mRNAs Associated with Two Nucleocytoplasmic-Shuttling Mammalian Splicing Factors
Background: Pre-mRNA splicing is an essential step in gene expression that occurs co-transcriptionally in the cell nucleus, involving a large number of RNA binding protein splicing factors, in addition to core spliceosome components. Several of these proteins are required for the recognition of intronic sequence elements, transiently associating with the primary transcript during splicing. Some protein splicing factors, such as the U2 small nuclear RNP auxiliary factor (U2AF), are known to be exported to the cytoplasm, despite being implicated solely in nuclear functions. This observation raises the question of whether U2AF associates with mature mRNA-ribonucleoprotein particles in transit to the cytoplasm, participating in additional cellular functions. Results: Here we report the identification of RNAs immunoprecipitated by a monoclonal antibody specific for the U2AF 65 kDa subunit (U2AF65) and demonstrate its association with spliced mRNAs. For comparison, we analyzed mRNAs associated with the polypyrimidine tract binding protein (PTB), a splicing factor that also binds to intronic pyrimidine-rich sequences but additionally participates in mRNA localization, stability, and translation. Our results show that 10% of cellular mRNAs expressed in HeLa cells associate differentially with U2AF65 and PTB. Among U2AF65-associated mRNAs there is a predominance of transcription factors and cell cycle regulators, whereas PTB-associated transcripts are enriched in mRNA species that encode proteins implicated in intracellular transport, vesicle trafficking, and apoptosis. Conclusion: Our results show that U2AF65 associates with specific subsets of spliced mRNAs, strongly suggesting that it is involved in novel cellular functions in addition to splicing
Expression microarray reproducibility is improved by optimising purification steps in RNA amplification and labelling.
BACKGROUND: Expression microarrays have evolved into a powerful tool with great potential for clinical application and therefore reliability of data is essential. RNA amplification is used when the amount of starting material is scarce, as is frequently the case with clinical samples. Purification steps are critical in RNA amplification and labelling protocols, and there is a lack of sufficient data to validate and optimise the process. RESULTS: Here the purification steps involved in the protocol for indirect labelling of amplified RNA are evaluated and the experimentally determined best method for each step with respect to yield, purity, size distribution of the transcripts, and dye coupling is used to generate targets tested in replicate hybridisations. DNase treatment of diluted total RNA samples followed by phenol extraction is the optimal way to remove genomic DNA contamination. Purification of double-stranded cDNA is best achieved by phenol extraction followed by isopropanol precipitation at room temperature. Extraction with guanidinium-phenol and Lithium Chloride precipitation are the optimal methods for purification of amplified RNA and labelled aRNA respectively. CONCLUSION: This protocol provides targets that generate highly reproducible microarray data with good representation of transcripts across the size spectrum and a coefficient of repeatability significantly better than that reported previously. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are
A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data.
Illumina BeadArrays are among the most popular and reliable platforms for gene expression profiling. However, little external scrutiny has been given to the design, selection and annotation of BeadArray probes, which is a fundamental issue in data quality and interpretation. Here we present a pipeline for the complete genomic and transcriptomic re-annotation of Illumina probe sequences, also applicable to other platforms, with its output available through a Web interface and incorporated into Bioconductor packages. We have identified several problems with the design of individual probes and we show the benefits of probe re-annotation on the analysis of BeadArray gene expression data sets. We discuss the importance of aspects such as probe coverage of individual transcripts, alternative messenger RNA splicing, single-nucleotide polymorphisms, repeat sequences, RNA degradation biases and probes targeting genomic regions with no known transcription. We conclude that many of the Illumina probes have unreliable original annotation and that our re-annotation allows analyses to focus on the good quality probes, which form the majority, and also to expand the scope of biological information that can be extracted
A consensus prognostic gene expression classifier for ER positive breast cancer.
BACKGROUND: A consensus prognostic gene expression classifier is still elusive in heterogeneous diseases such as breast cancer. RESULTS: Here we perform a combined analysis of three major breast cancer microarray data sets to home in on a universally valid prognostic molecular classifier in estrogen receptor (ER) positive tumors. Using a recently developed robust measure of prognostic separation, we further validate the prognostic classifier in three external independent cohorts, confirming the validity of our molecular classifier in a total of 877 ER positive samples. Furthermore, we find that molecular classifiers may not outperform classical prognostic indices but that they can be used in hybrid molecular-pathological classification schemes to improve prognostic separation. CONCLUSION: The prognostic molecular classifier presented here is the first to be valid in over 877 ER positive breast cancer samples and across three different microarray platforms. Larger multi-institutional studies will be needed to fully determine the added prognostic value of molecular classifiers when combined with standard prognostic factors
Latent regulatory potential of human-specific repetitive elements
At least half of the human genome is derived from repetitive elements, which are often lineage specific and silenced by a variety of genetic and epigenetic mechanisms. Using a transchromosomic mouse strain that transmits an almost complete single copy of human chromosome 21 via the female germline, we show that a heterologous regulatory environment can transcriptionally activate transposon-derived human regulatory regions. In the mouse nucleus, hundreds of locations on human chromosome 21 newly associate with activating histone modifications in both somatic and germline tissues, and influence the gene expression of nearby transcripts. These regions are enriched with primate and human lineage-specific transposable elements, and their activation corresponds to changes in DNA methylation at CpG dinucleotides. This study reveals the latent regulatory potential of the repetitive human genome and illustrates the species specificity of mechanisms that control it
The splicing factor XAB2 interacts with ERCC1-XPF and XPG for R-loop processing
RNA splicing, transcription and the DNA damage response are intriguingly linked in mammals but the underlying mechanisms remain poorly understood. Using an in vivo biotinylation tagging approach in mice, we show that the splicing factor XAB2 interacts with the core spliceosome and that it binds to spliceosomal U4 and U6 snRNAs and pre-mRNAs in developing livers. XAB2 depletion leads to aberrant intron retention, R-loop formation and DNA damage in cells. Studies in illudin S-treated cells and Csb(m/m) developing livers reveal that transcription-blocking DNA lesions trigger the release of XAB2 from all RNA targets tested. Immunoprecipitation studies reveal that XAB2 interacts with ERCC1-XPF and XPG endonucleases outside nucleotide excision repair and that the trimeric protein complex binds RNA:DNA hybrids under conditions that favor the formation of R-loops. Thus, XAB2 functionally links the spliceosomal response to DNA damage with R-loop processing with important ramifications for transcription-coupled DNA repair disorders
MicroRNA expression profiling of human breast cancer identifies new markers of tumor subtype.
BACKGROUND: MicroRNAs (miRNAs), a class of short non-coding RNAs found in many plants and animals, often act post-transcriptionally to inhibit gene expression. RESULTS: Here we report the analysis of miRNA expression in 93 primary human breast tumors, using a bead-based flow cytometric miRNA expression profiling method. Of 309 human miRNAs assayed, we identify 133 miRNAs expressed in human breast and breast tumors. We used mRNA expression profiling to classify the breast tumors as luminal A, luminal B, basal-like, HER2+ and normal-like. A number of miRNAs are differentially expressed between these molecular tumor subtypes and individual miRNAs are associated with clinicopathological factors. Furthermore, we find that miRNAs could classify basal versus luminal tumor subtypes in an independent data set. In some cases, changes in miRNA expression correlate with genomic loss or gain; in others, changes in miRNA expression are likely due to changes in primary transcription and/or miRNA biogenesis. Finally, the expression of DICER1 and AGO2 is correlated with tumor subtype and may explain some of the changes in miRNA expression observed. CONCLUSION: This study represents the first integrated analysis of miRNA expression, mRNA expression and genomic changes in human breast cancer and may serve as a basis for functional studies of the role of miRNAs in the etiology of breast cancer. Furthermore, we demonstrate that bead-based flow cytometric miRNA expression profiling might be a suitable platform to classify breast cancer into prognostic molecular subtypes.
MicroRNA expression profiling of human breast cancer identifies new markers of tumor subtype
Integrated analysis of miRNA expression and genomic changes in human breast tumors allows the classification of tumor subtypes
High-resolution aCGH and expression profiling identifies a novel genomic subtype of ER negative breast cancer.
BACKGROUND: The characterization of copy number alteration patterns in breast cancer requires high-resolution genome-wide profiling of a large panel of tumor specimens. To date, most genome-wide array comparative genomic hybridization studies have used tumor panels of relatively large tumor size and high Nottingham Prognostic Index (NPI) that are not as representative of breast cancer demographics. RESULTS: We performed an oligo-array-based high-resolution analysis of copy number alterations in 171 primary breast tumors of relatively small size and low NPI, which was therefore more representative of breast cancer demographics. Hierarchical clustering over the common regions of alteration identified a novel subtype of high-grade estrogen receptor (ER)-negative breast cancer, characterized by a low genomic instability index. We were able to validate the existence of this genomic subtype in one external breast cancer cohort. Using matched array expression data we also identified the genomic regions showing the strongest coordinate expression changes ('hotspots'). We show that several of these hotspots are located in the phosphatome, kinome and chromatinome, and harbor members of the 122-breast cancer CAN-list. Furthermore, we identify frequently amplified hotspots on 8q22.3 (EDD1, WDSOF1), 8q24.11-13 (THRAP6, DCC1, SQLE, SPG8) and 11q14.1 (NDUFC2, ALG8, USP35) associated with significantly worse prognosis. Amplification of any of these regions identified 37 samples with significantly worse overall survival (hazard ratio (HR) = 2.3 (1.3-1.4) p = 0.003) and time to distant metastasis (HR = 2.6 (1.4-5.1) p = 0.004) independently of NPI. CONCLUSION: We present strong evidence for the existence of a novel subtype of high-grade ER-negative tumors that is characterized by a low genomic instability index. 
We also provide a genome-wide list of common copy number alteration regions in breast cancer that show strong coordinate aberrant expression, and further identify novel frequently amplified regions that correlate with poor prognosis. Many of the genes associated with these regions represent likely novel oncogenes or tumor suppressors.