Jetset: selecting the optimal microarray probe set to represent a gene

Author: Birkbak Nicolai J
Eklund Aron C
Gyorffy Balazs
Li Qiyuan
Szallasi Zoltan
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract Background Interpretation of gene expression microarrays requires a mapping from probe set to gene. On many Affymetrix gene expression microarrays, a given gene may be detected by multiple probe sets, which may deliver inconsistent or even contradictory measurements. Therefore, obtaining an unambiguous expression estimate of a pre-specified gene can be a nontrivial but essential task. Results We developed scoring methods to assess each probe set for specificity, splice isoform coverage, and robustness against transcript degradation. We used these scores to select a single representative probe set for each gene, thus creating a simple one-to-one mapping between gene and probe set. To test this method, we evaluated concordance between protein measurements and gene expression values, and between sets of genes whose expression is known to be correlated. For both test cases, we identified genes that were nominally detected by multiple probe sets, and we found that the probe set chosen by our method showed stronger concordance. Conclusions This method provides a simple, unambiguous mapping to allow assessment of the expression levels of specific genes of interest.
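
The selection step described in this abstract can be illustrated with a minimal sketch. The abstract names three per-probe-set scores but does not say how they are combined, so multiplying them into one overall score is an assumption here, and the function name, probe set IDs, gene names, and score values are all hypothetical.

```python
from math import prod


def select_representative(scores_by_probe_set, gene_of):
    """Return {gene: probe_set}, keeping the highest-scoring probe set per gene.

    scores_by_probe_set maps a probe set ID to a tuple of
    (specificity, coverage, robustness) scores.
    """
    best = {}
    for probe_set, scores in scores_by_probe_set.items():
        # Assumption: the overall score is the product of the component scores.
        overall = prod(scores)
        gene = gene_of[probe_set]
        if gene not in best or overall > best[gene][1]:
            best[gene] = (probe_set, overall)
    return {gene: probe_set for gene, (probe_set, _) in best.items()}


# Hypothetical probe sets and scores, for illustration only.
scores = {
    "ps_1": (0.9, 0.8, 0.7),
    "ps_2": (0.5, 1.0, 0.9),
    "ps_3": (0.6, 0.6, 0.6),
}
gene_of = {"ps_1": "GENE_A", "ps_2": "GENE_A", "ps_3": "GENE_B"}
mapping = select_representative(scores, gene_of)
```

Here GENE_A is nominally detected by two probe sets, and only the one with the larger combined score is kept, which yields the one-to-one mapping the abstract describes.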

Harvard University - DASH

Online Research Database In Technology

UCL Discovery

Semmelweis Repository

Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: High-resolution annotation for microarrays

Author: Cam Margaret C
Lee Joseph C
Lu Jun
Salit Marc L
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Extracting biological information from high-density Affymetrix arrays is a multi-step process that begins with the accurate annotation of microarray probes. Shortfalls in the original Affymetrix probe annotation have been described; however, few studies have provided rigorous solutions for routine data analysis. RESULTS: Using AceView, a comprehensive human transcript database, we have reannotated the probes by matching them to RNA transcripts instead of genes. Based on this transcript-level annotation, a new probe set definition was created in which every probe in a probe set maps to a common set of AceView gene transcripts. In addition, using artificial data sets we identified that a minimal probe set size of 4 is necessary for reliable statistical summarization. We further demonstrate that applying the new probe set definition can detect specific transcript variants contributing to differential expression and it also improves cross-platform concordance. CONCLUSION: We conclude that our transcript-level reannotation and redefinition of probe sets complement the original Affymetrix design. Redefinitions introduce probe sets whose sizes may not support reliable statistical summarization; therefore, we advocate using our transcript-level mapping redefinition in a secondary analysis step rather than as a replacement. Knowing which specific transcripts are differentially expressed is important to properly design probe/primer pairs for validation purposes. For convenience, we have created custom chip-description-files (CDFs) and annotation files for our new probe set definitions that are compatible with Bioconductor, Affymetrix Expression Console or third party software
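
As a rough illustration of the redefinition described above, the sketch below groups probes so that every probe in a group maps to the same set of transcripts, then separates out groups below the minimal size of 4 that the abstract reports. The function name, the input layout, and the probe and transcript IDs are hypothetical; this is not the authors' pipeline.

```python
from collections import defaultdict

# Minimal probe set size the abstract reports as necessary for reliable summarization.
MIN_PROBES = 4


def redefine_probe_sets(probe_to_transcripts):
    """Group probes by the exact set of transcripts they match.

    Returns (reliable, small): dicts keyed by a frozenset of transcript IDs,
    split by whether the group reaches MIN_PROBES probes.
    """
    groups = defaultdict(list)
    for probe, transcripts in probe_to_transcripts.items():
        if transcripts:  # probes matching no transcript are left out
            groups[frozenset(transcripts)].append(probe)
    reliable = {k: sorted(v) for k, v in groups.items() if len(v) >= MIN_PROBES}
    small = {k: sorted(v) for k, v in groups.items() if len(v) < MIN_PROBES}
    return reliable, small


# Hypothetical probe-to-transcript matches, for illustration only.
matches = {
    "p1": {"t1", "t2"}, "p2": {"t1", "t2"}, "p3": {"t1", "t2"},
    "p4": {"t1", "t2"}, "p5": {"t1", "t2"},
    "p6": {"t1"}, "p7": {"t1"},
    "p8": set(),
}
reliable, small = redefine_probe_sets(matches)
```

Returning the undersized groups separately, rather than discarding them, mirrors the abstract's advice to use the redefinition as a secondary analysis step.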

AffyProbeMiner: a web resource for computing or retrieving accurately redefined Affymetrix probe sets

Author: A Gunes Koru
Alessandro Ferrucci
Antej Nuhanovic
Ari Kahn
Barry R Zeeberg
David W Kane
Gang Qu
Hongfang Liu
John N Weinstein
Michael C Ryan
Peter J Munson
William C Reinhold
Publication venue
Publication date: 01/01/2007

CiteSeerX

Application of a correlation correction factor in a microarray cross-platform reproducibility study

Author: Archer Kellie J.
Chaplin Michael D.
Dumur Catherine I.
Ferreira-Gonzalez Andrea
Garrett Carleton T.
Grant Geraldine
Guiseppi-Elie Anthony
Taylor G. Scott
Publication venue: VCU Scholars Compass
Publication date: 01/01/2007

Background Recent research examining cross-platform correlation of gene expression intensities has yielded mixed results. In this study, we demonstrate use of a correction factor for estimating cross-platform correlations. Results In this paper, three technical replicate microarrays were hybridized to each of three platforms. The three platforms were then analyzed to assess both intra- and cross-platform reproducibility. We present various methods for examining intra-platform reproducibility. We also examine cross-platform reproducibility using Pearson's correlation. Additionally, we previously developed a correction factor for Pearson's correlation which is applicable when X and Y are measured with error. Herein we demonstrate that correcting for measurement error by estimating the disattenuated correlation substantially improves cross-platform correlations. Conclusion When estimating cross-platform correlation, it is essential to thoroughly evaluate intra-platform reproducibility as a first step. In addition, since measurement error is present in microarray gene expression data, methods to correct for attenuation are useful in decreasing the bias in cross-platform correlation estimates.
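
The abstract does not state the correction factor itself. As a hedged illustration, the sketch below uses the classical correction for attenuation, which divides the observed cross-platform correlation by the square root of the product of the two intra-platform reliabilities. Whether the authors' factor takes exactly this form is an assumption, and the function name and numeric values are hypothetical.

```python
from math import sqrt


def disattenuated_correlation(r_xy, r_xx, r_yy):
    """Classical correction for attenuation (assumed form, not taken from the abstract).

    r_xy: observed Pearson correlation between platforms X and Y.
    r_xx, r_yy: intra-platform reliabilities, e.g. correlations between
    technical replicates on the same platform.
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical values, for illustration only.
corrected = disattenuated_correlation(r_xy=0.72, r_xx=0.90, r_yy=0.80)
```

With perfect reliabilities the correction leaves the observed correlation unchanged; lower reliabilities scale it up, which is why intra-platform reproducibility has to be evaluated first.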

VCU Scholars Compass

GATExplorer: Genomic and Transcriptomic Explorer; mapping expression probes to gene loci, transcripts, exons and ncRNAs

Author: De Las Rivas Javier
Dinger Marcel E
Fontanillo Celia
Risueño Alberto
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome-wide expression studies have developed exponentially in recent years as a result of extensive use of microarray technology. However, expression signals are typically calculated using the assignment of "probesets" to genes, without addressing the problem of "gene" definition or proper consideration of the location of the measuring probes in the context of the currently known genomes and transcriptomes. Moreover, as our knowledge of metazoan genomes improves, the number of both protein-coding and noncoding genes, as well as their associated isoforms, continues to increase. Consequently, there is a need for new databases that combine genomic and transcriptomic information and provide updated mapping of expression probes to current genomic annotations. Results: GATExplorer (Genomic and Transcriptomic Explorer) is a database and web platform that integrates a gene loci browser with nucleotide level mappings of oligo probes from expression microarrays. It allows interactive exploration of gene loci, transcripts and exons of human, mouse and rat genomes, and shows the specific location of all mappable Affymetrix microarray probes and their respective expression levels in a broad set of biological samples. The web site allows visualization of probes in their genomic context together with any associated protein-coding or noncoding transcripts. In the case of all-exon arrays, this provides a means by which the expression of the individual exons within a gene can be compared, thereby facilitating the identification and analysis of alternatively spliced exons. The application integrates data from four major source databases: Ensembl, RNAdb, Affymetrix and GeneAtlas; and it provides the users with a series of files and packages (R CDFs) to analyze particular query expression datasets. The maps cover both the widely used Affymetrix GeneChip microarrays based on 3' expression (e.g. human HG U133 series) and the all-exon expression microarrays (Gene 1.0 and Exon 1.0). Conclusions: GATExplorer is an integrated database that combines genomic/transcriptomic visualization with nucleotide-level probe mapping. By considering expression at the nucleotide level rather than the gene level, it shows that the arrays detect expression signals from entities that most researchers do not contemplate or discriminate. This approach provides the means to undertake a higher resolution analysis of microarray data and potentially extract considerably more detailed and biologically accurate information from existing and future microarray experiments.

University of Queensland eSpace

Digital.CSIC

Evaluation of Microarray Preprocessing Algorithms Based on Concordance with RT-PCR in Clinical Samples

Author: Aron C. Eklund
Balazs Gyorffy
Bela Molnar
Chad Creighton
Hermann Lage
Zoltan Szallasi
Publication venue: Public Library of Science
Publication date: 01/01/2009

BACKGROUND Several preprocessing algorithms for Affymetrix gene expression microarrays have been developed, and their performance on spike-in data sets has been evaluated previously. However, a comprehensive comparison of preprocessing algorithms on samples taken under research conditions has not been performed. METHODOLOGY/PRINCIPAL FINDINGS We used TaqMan RT-PCR arrays as a reference to evaluate the accuracy of expression values from Affymetrix microarrays in two experimental data sets: one comprising 84 genes in 36 colon biopsies, and the other comprising 75 genes in 29 cancer cell lines. We evaluated consistency using the Pearson correlation between measurements obtained on the two platforms. Also, we introduce the log-ratio discrepancy as a more relevant measure of discordance between gene expression platforms. Of nine preprocessing algorithms tested, PLIER+16 produced expression values that were most consistent with RT-PCR measurements, although the difference in performance between most of the algorithms was not statistically significant. CONCLUSIONS/SIGNIFICANCE Our results support the choice of PLIER+16 for the preprocessing of clinical Affymetrix microarray data. However, other algorithms performed similarly and are probably also good choices

Repository of the Academy's Library

Semmelweis Repository

Online Research Database In Technology

Novel definition files for human GeneChips based on GeneAnnot

Author: Bicciato Silvio
Bortoluzzi Stefania
Coppe Alessandro
Danieli Gian Antonio
Ferrari Francesco
Ferrari Sergio
Lancet Doron
Safran Marilyn
Shmoish Michael
Sirota Alexandra
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract Background Improvements in genome sequence annotation revealed discrepancies in the original probeset/gene assignment in Affymetrix microarray and the existence of differences between annotations and effective alignments of probes and transcription products. In the current generation of Affymetrix human GeneChips, most probesets include probes matching transcripts from more than one gene and probes which do not match any transcribed sequence. Results We developed a novel set of custom Chip Definition Files (CDF) and the corresponding Bioconductor libraries for Affymetrix human GeneChips, based on the information contained in the GeneAnnot database. GeneAnnot-based CDFs are composed of unique custom-probesets, including only probes matching a single gene. Conclusion GeneAnnot-based custom CDFs solve the problem of a reliable reconstruction of expression levels and eliminate the existence of more than one probeset per gene, which often leads to discordant expression signals for the same transcript when gene differential expression is the focus of the analysis. GeneAnnot CDFs are freely distributed and fully compliant with Affymetrix standards and all available software for gene expression analysis. The CDF libraries are available from http://www.xlab.unimo.it/GA_CDF, along with supplementary information (CDF libraries, installation guidelines and R code, CDF statistics, and analysis results).

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Archivio istituzionale della ricerca - Università di Padova

Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations

Author: Crispin J Miller
Michał J Okoniewski
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Microarrays measure the binding of nucleotide sequences to a set of sequence specific probes. This information is combined with annotation specifying the relationship between probes and targets and used to make inferences about transcript- and, ultimately, gene expression. In some situations, a probe is capable of hybridizing to more than one transcript; in others, multiple probes can target a single sequence. These 'multiply targeted' probes can result in non-independence between measured expression levels. RESULTS: An analysis of these relationships for Affymetrix arrays considered both the extent and influence of exact matches between probe and transcript sequences. For the popular HGU133A array, approximately half of the probesets were found to interact in this way. Both real and simulated expression datasets were used to examine how these effects influenced the expression signal. Multiple targeting not only increased signal strength for the affected probesets; its major effect was to significantly increase their correlation, even when only a single probe from a probeset was involved. By building a network of probe-probeset-transcript relationships, it is possible to identify families of interacting probesets. More than 10% of the families contain members annotated to different genes or even different Unigene clusters. Within a family, a mixture of genuine biological and artefactual correlations can occur. CONCLUSION: Multiple targeting is not only prevalent, but also significant. The ability of probesets to hybridize to more than one gene product can lead to false positives when analysing gene expression. Comprehensive annotation describing multiple targeting is required when interpreting array data.
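
One way to picture the families of interacting probesets is as connected components of a graph that links each probeset to every transcript matched by at least one of its probes. The union-find sketch below makes that assumption; the abstract does not describe the authors' network construction in detail, and the function name, probeset IDs, and transcript IDs are hypothetical.

```python
from collections import defaultdict


def probeset_families(probe_hits):
    """Group probesets that are linked through shared transcripts.

    probe_hits: iterable of (probeset, transcript) pairs, one pair per
    exact probe-to-transcript match.
    Returns a sorted list of families, each a sorted list of probeset IDs.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    probesets = set()
    for probeset, transcript in probe_hits:
        probesets.add(probeset)
        # Tag nodes so a probeset and a transcript with the same ID never collide.
        union(("ps", probeset), ("tx", transcript))

    families = defaultdict(set)
    for probeset in probesets:
        families[find(("ps", probeset))].add(probeset)
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])


# Hypothetical matches: A and B share t1, B and C share t2, D stands alone.
hits = [("A", "t1"), ("B", "t1"), ("B", "t2"), ("C", "t2"), ("D", "t3")]
families = probeset_families(hits)
```

Families with more than one member are the candidates for the artefactual correlations the abstract warns about, since a single shared probe match is enough to link two probesets.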

Consistent annotation of gene expression arrays

Author: Ballester Benoît
Flicek Paul
Johnson Nathan
Proctor Glenn
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background Gene expression arrays are valuable and widely used tools for biomedical research. Today's commercial arrays attempt to measure the expression level of all of the genes in the genome. Effectively translating the results from the microarray into a biological interpretation requires an accurate mapping between the probesets on the array and the genes that they are targeting. Although major array manufacturers provide annotations of their gene expression arrays, the methods used by various manufacturers are different and the annotations are difficult to keep up to date in the rapidly changing world of biological sequence databases. Results We have created a consistent microarray annotation protocol applicable to all of the major array manufacturers. We constantly keep our annotations updated with the latest Ensembl Gene predictions, and thus cross-referenced with a large number of external biomedical sequence database identifiers. We show that these annotations are accurate and address in detail reasons for the minority of probesets that cannot be annotated. Annotations are publicly accessible through the Ensembl Genome Browser and programmatically through the Ensembl Application Programming Interface. They are also seamlessly integrated into the BioMart data-mining tool and the biomaRt package of BioConductor. Conclusions Consistent, accurate and updated gene expression array annotations remain critical for biological research. Our annotations facilitate accurate biological interpretation of gene expression profiles.

HAL AMU