
Using the R Package crlmm for Genotyping and Copy Number Estimation

Author: Benilton Carvalho
Ingo Ruczinski
Matthew E. Ritchie
Rafael A. Irizarry
Robert B. Scharpf

Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number and integration of the marker-level estimates with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R.

Research Papers in Economics

Illumina WG-6 BeadChip strips should be normalized separately

Author: Banerjee Ashish
Gerondakis Steve
Ritchie Matthew E
Shi Wei
Smyth Gordon K
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: Illumina Sentrix-6 Whole-Genome Expression BeadChips are relatively new microarray platforms which have been used in many microarray studies in the past few years. These Chips have a unique design in which each Chip contains six microarrays and each microarray consists of two separate physical strips, posing special challenges for precise between-array normalization of expression values. RESULTS: None of the normalization strategies proposed so far for this microarray platform allow for the possibility of systematic variation between the two strips comprising each array. That this variation can be substantial is illustrated by a data example. We demonstrate that normalizing at the strip-level rather than at the array-level can effectively remove this between-strip variation, improve the precision of gene expression measurements and discover more differentially expressed genes. The gain is substantial, yielding a 20% increase in statistical information and doubling the number of genes detected at a 5% false discovery rate. Functional analysis reveals that the extra genes found tend to have interesting biological meanings, dramatically strengthening the biological conclusions from the experiment. Strip-level normalization still outperforms array-level normalization when non-expressed probes are filtered out. CONCLUSION: Plots are proposed which demonstrate how the need for strip-level normalization relates to inconsistent intensity range variation between the strips. Strip-level normalization is recommended for the preprocessing of Illumina Sentrix-6 BeadChips whenever the intensity range is seen to be inconsistent between the strips. R code is provided to implement the recommended plots and normalization algorithms.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Using the R Package crlmm for Genotyping and Copy Number Estimation

Author: Carvalho Benilton
Irizarry Rafael A.
Ritchie Matthew E.
Ruczinski Ingo
Scharpf Robert B.
Publication venue: Foundation for Open Access Statistics
Publication date: 01/05/2011

Directory of Open Access Journals

PubMed Central

Journal of Statistical Software

Spike-in validation of an Illumina-specific variance-stabilizing transformation

Author: Barbosa-Morais Nuno L
Dunning Mark J
Lynch Andy G
Ritchie Matthew E
Tavaré Simon
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Variance-stabilizing techniques have been used for some time in the analysis of gene expression microarray data. A new adaptation, the variance-stabilizing transformation (VST), has recently been developed to take advantage of the unique features of Illumina BeadArrays. VST has been shown to perform well in comparison with the widely-used approach of taking a log2 transformation, but has not been validated on a spike-in experiment. We apply VST to the data from a recently published spike-in experiment and compare it both to a regular log2 analysis and a recently recommended analysis that can be applied if all raw data are available. FINDINGS: VST provides more power to detect differentially expressed genes than a log2 transformation. However, the gain in power is roughly the same as utilizing the raw data from an experiment and weighting observations accordingly. VST is still advantageous when large changes in expression are anticipated, while a weighted log2 approach performs better for smaller changes. CONCLUSION: VST can be recommended for summarized Illumina data regardless of which Illumina pre-processing options have been used. However, using the raw data is still encouraged whenever possible.

Crossref

Springer - Publisher Connector

PubMed Central

University of Melbourne Institutional Repository

University of St. Andrews - Pure

limma powers differential expression analyses for RNA-sequencing and microarray studies

Author: Hu Yifang
Law Charity W.
Phipson Belinda
Ritchie Matthew E.
Shi Wei
Smyth Gordon K.
Wu Di
Publication date: 02/08/2017

limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

RERO DOC Digital Library

Empirical array quality weights in the analysis of microarray data

Author: Diyagama Dileepa
Dobrovic Alexander
Holloway Andrew
Neilson Jody
Ritchie Matthew E
Smyth Gordon K
van Laar Ryan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Assessment of array quality is an essential step in the analysis of data from microarray experiments. Once detected, less reliable arrays are typically excluded or "filtered" from further analysis to avoid misleading results. RESULTS: In this article, a graduated approach to array quality is considered based on empirical reproducibility of the gene expression measures from replicate arrays. Weights are assigned to each microarray by fitting a heteroscedastic linear model with shared array variance terms. A novel gene-by-gene update algorithm is used to efficiently estimate the array variances. The inverse variances are used as weights in the linear model analysis to identify differentially expressed genes. The method successfully assigns lower weights to less reproducible arrays from different experiments. Down-weighting the observations from suspect arrays increases the power to detect differential expression. In smaller experiments, this approach outperforms the usual method of filtering the data. The method is available in the limma software package which is implemented in the R software environment. CONCLUSION: This method complements existing normalisation and spot quality procedures, and allows poorer quality arrays, which would otherwise be discarded, to be included in an analysis. It is applicable to microarray data from experiments with some level of replication.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Autotransporters but not pAA are critical for rabbit colonization by Shiga toxin-producing Escherichia coli O104:H4

Author: Bronson Rod
Davis Brigid M
Fang Gang
Hatzios Stavroula K
Munera Diana
Ritchie Jennifer M
Schadt Eric E
Waldor Matthew K
Publication venue: Springer Science and Business Media LLC
Publication date: 20/01/2014

The outbreak of diarrhea and hemolytic uremic syndrome that occurred in Germany in 2011 was caused by a Shiga toxin-producing enteroaggregative Escherichia coli (EAEC) strain. The strain was classified as EAEC due to the presence of a plasmid (pAA) that mediates a characteristic pattern of aggregative adherence on cultured cells, the defining feature of EAEC that has classically been associated with virulence. Here, we describe an infant rabbit-based model of intestinal colonization and diarrhea caused by the outbreak strain, which we use to decipher the factors that mediate the pathogen’s virulence. Shiga toxin is the key factor required for diarrhea. Unexpectedly, we observe that pAA is dispensable for intestinal colonization and development of intestinal pathology. Instead, chromosome-encoded autotransporters are critical for robust colonization and diarrheal disease in this model. Our findings suggest that conventional wisdom linking aggregative adherence to EAEC intestinal colonization is false for at least a subset of strains.

Harvard University - DASH

Surrey Research Insight

High-resolution transcription atlas of the mitotic cell cycle in budding yeast.

Author: Bork Peer
Granovskaia Marina V
Huber Wolfgang
Jensen Lars J
Ning Ye
Ritchie Matthew E
Steinmetz Lars M
Toedling Joern
Publication venue: Genome Biol
Publication date: 01/01/2010

RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

BACKGROUND: Extensive transcription of non-coding RNAs has been detected in eukaryotic genomes and is thought to constitute an additional layer in the regulation of gene expression. Despite this role, their transcription through the cell cycle has not been studied; genome-wide approaches have only focused on protein-coding genes. To explore the complex transcriptome architecture underlying the budding yeast cell cycle, we used 8 bp tiling arrays to generate a 5 minute-resolution, strand-specific expression atlas of the whole genome. RESULTS: We discovered 523 antisense transcripts, of which 80 cycle or are located opposite periodically expressed mRNAs, 135 unannotated intergenic non-coding RNAs, of which 11 cycle, and 109 cell-cycle-regulated protein-coding genes that had not previously been shown to cycle. We detected periodic expression coupling of sense and antisense transcript pairs, including antisense transcripts opposite of key cell-cycle regulators, like FAR1 and TAF2. CONCLUSIONS: Our dataset presents the most comprehensive resource to date on gene expression during the budding yeast cell cycle. It reveals periodic expression of both protein-coding and non-coding RNA and profiles the expression of non-annotated RNAs throughout the cell cycle for the first time. This data enables hypothesis-driven mechanistic studies concerning the functions of non-coding RNAs.

PubMed Central

Copenhagen University Research Information System

Apollo (Cambridge)

MDC Repository

University of Melbourne Institutional Repository

A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data.

Author: Barbosa-Morais Nuno L
Darot Jeremy FJ
Dunning Mark J
Lynch Andy G
Ritchie Matthew E
Samarajiwa Shamith A
Tavaré Simon
Publication venue: Nucleic Acids Res
Publication date: 18/11/2009

Illumina BeadArrays are among the most popular and reliable platforms for gene expression profiling. However, little external scrutiny has been given to the design, selection and annotation of BeadArray probes, which is a fundamental issue in data quality and interpretation. Here we present a pipeline for the complete genomic and transcriptomic re-annotation of Illumina probe sequences, also applicable to other platforms, with its output available through a Web interface and incorporated into Bioconductor packages. We have identified several problems with the design of individual probes and we show the benefits of probe re-annotation on the analysis of BeadArray gene expression data sets. We discuss the importance of aspects such as probe coverage of individual transcripts, alternative messenger RNA splicing, single-nucleotide polymorphisms, repeat sequences, RNA degradation biases and probes targeting genomic regions with no known transcription. We conclude that many of the Illumina probes have unreliable original annotation and that our re-annotation allows analyses to focus on the good quality probes, which form the majority, and also to expand the scope of biological information that can be extracted.

PubMed Central

Apollo (Cambridge)

University of St. Andrews - Pure