Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations

Author: A Nimgaonkar
Affymetrix
AI Su
AJ Butte
AT Adai
BH Mecham
C Wu
CL Wilson
Crispin J Miller
E Birney
G Liu
G Sherlock
H Wang
HS Leong
J Harbig
J Stuart
KD Pruitt
L Gautier
M Dai
Michał J Okoniewski
O Teuffel
R Gentleman
R Irizarry
S Carter
S Zakharkin
T Attwood
W Shannon
Z Wu
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Microarrays measure the binding of nucleotide sequences to a set of sequence-specific probes. This information is combined with annotation specifying the relationship between probes and targets and used to make inferences about transcript and, ultimately, gene expression. In some situations a probe is capable of hybridizing to more than one transcript; in others, multiple probes can target a single sequence. These 'multiply targeted' probes can result in non-independence between measured expression levels. RESULTS: An analysis of these relationships for Affymetrix arrays considered both the extent and influence of exact matches between probe and transcript sequences. For the popular HGU133A array, approximately half of the probesets were found to interact in this way. Both real and simulated expression datasets were used to examine how these effects influenced the expression signal. Multiple targeting was found not only to increase signal strength for the affected probesets; its major effect was to significantly increase their correlation, even when only a single probe from a probeset was involved. By building a network of probe-probeset-transcript relationships, it is possible to identify families of interacting probesets. More than 10% of the families contain members annotated to different genes or even different Unigene clusters. Within a family, a mixture of genuine biological and artefactual correlations can occur. CONCLUSION: Multiple targeting is not only prevalent, but also significant. The ability of probesets to hybridize to more than one gene product can lead to false positives when analysing gene expression. Comprehensive annotation describing multiple targeting is required when interpreting array data.
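
The effect described in this abstract, where a single shared probe can make two probesets correlate, can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' analysis: the intensities are arbitrary additive units, probesets are summarized by a plain mean, gene B is held constant, and the names `pearson` and `simulate` are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(shared_probes, n_samples=200, n_probes=11, seed=0):
    """Correlation between probeset A and probeset B across samples.

    `shared_probes` probes of probeset B also pick up transcript A
    (an assumed, purely additive cross-hybridization model).
    """
    rng = random.Random(seed)
    summary_a, summary_b = [], []
    for _ in range(n_samples):
        gene_a = rng.gauss(8.0, 1.0)  # transcript A varies between samples
        gene_b = 6.0                  # transcript B is constant (assumption)
        probes_a = [gene_a + rng.gauss(0, 0.1) for _ in range(n_probes)]
        probes_b = []
        for i in range(n_probes):
            signal = gene_b + rng.gauss(0, 0.1)
            if i < shared_probes:
                signal += gene_a      # this probe also binds transcript A
            probes_b.append(signal)
        # Summarize each probeset by the mean of its probes (assumption).
        summary_a.append(sum(probes_a) / n_probes)
        summary_b.append(sum(probes_b) / n_probes)
    return pearson(summary_a, summary_b)


if __name__ == "__main__":
    print("no shared probe: ", round(simulate(0), 3))
    print("one shared probe:", round(simulate(1), 3))
```

In this toy setting the correlation is close to zero without a shared probe and becomes strong with one, because the shared probe is the only source of sample-to-sample variation in probeset B. A robust summary such as a median would behave differently.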

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PLANdbAffy: probe-level annotation database for Affymetrix expression microarrays

Author: Anna S. Ershova
Anna S. Karyagina
Benson
Chalifa-Caspi
Dai
Gautier
Harbig
Ilia S. Lossev
Irizarry
Johnson
Kent
Lemon
Leong
Liu
Mikhail O. Vasiliev
Okoniewski
Orlov
Pan
Pruitt
Ramil N. Nurtdinov
Sayers
Sherry
Wang
Wu
Yates
Yu
Zhang
Zhang
Zhou
Publication venue: Oxford University Press

Standard Affymetrix technology evaluates gene expression by measuring the intensity of mRNA hybridization with a panel of 25-mer oligonucleotide probes and summarizing the probe signal intensities by a robust averaging method. However, in many cases the signal intensity of a probe does not correlate with gene expression. This could be due to hybridization of the probe to a transcript of another gene, mapping of the probe to an intron, alternative splicing, single nucleotide polymorphisms, and other reasons. We have developed a database, PLANdbAffy (available at http://affymetrix2.bioinf.fbb.msu.ru), that contains the results of the alignment of probe sequences from five Affymetrix expression microarrays to the human genome. We have determined the probes matching the transcript-coding regions in the correct orientation. For each such probe alignment region, we determined the mRNA and EST sequences that contain the probe sequence. In the textual part of the database interface we summarize the data on the sequences that cover the probe alignment region and the SNPs that are located inside it. The graphical part of our database interface is implemented as custom tracks for the UCSC genome browser, which allow one to use all the data offered by the UCSC browser.

Crossref

PubMed Central

Jetset: selecting the optimal microarray probe set to represent a gene

Author: Birkbak Nicolai J
Eklund Aron C
Gyorffy Balazs
Li Qiyuan
Szallasi Zoltan
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Interpretation of gene expression microarrays requires a mapping from probe set to gene. On many Affymetrix gene expression microarrays, a given gene may be detected by multiple probe sets, which may deliver inconsistent or even contradictory measurements. Therefore, obtaining an unambiguous expression estimate of a pre-specified gene can be a nontrivial but essential task. Results: We developed scoring methods to assess each probe set for specificity, splice isoform coverage, and robustness against transcript degradation. We used these scores to select a single representative probe set for each gene, thus creating a simple one-to-one mapping between gene and probe set. To test this method, we evaluated concordance between protein measurements and gene expression values, and between sets of genes whose expression is known to be correlated. For both test cases, we identified genes that were nominally detected by multiple probe sets, and we found that the probe set chosen by our method showed stronger concordance. Conclusions: This method provides a simple, unambiguous mapping to allow assessment of the expression levels of specific genes of interest.
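
The selection step described here, scoring each probe set and keeping one per gene, can be sketched roughly as follows. The abstract does not say how the three scores are combined, so the product used below is an assumption. The probe set IDs, gene names, score values, and the function name `pick_representatives` are all hypothetical.

```python
def pick_representatives(probe_sets):
    """Return {gene: probe_set_id}, keeping the best-scoring probe set per gene.

    Each entry is a dict with keys 'id', 'gene', 'specificity',
    'coverage' and 'robustness' (scores assumed to lie in [0, 1]).
    """
    best = {}
    for ps in probe_sets:
        # Assumed combination rule: multiply the three component scores.
        overall = ps["specificity"] * ps["coverage"] * ps["robustness"]
        current = best.get(ps["gene"])
        if current is None or overall > current[1]:
            best[ps["gene"]] = (ps["id"], overall)
    return {gene: ps_id for gene, (ps_id, _) in best.items()}


# Hypothetical probe sets for two hypothetical genes.
example = [
    {"id": "ps_1", "gene": "GENE_A", "specificity": 0.9, "coverage": 1.0, "robustness": 0.8},
    {"id": "ps_2", "gene": "GENE_A", "specificity": 0.5, "coverage": 1.0, "robustness": 0.9},
    {"id": "ps_3", "gene": "GENE_B", "specificity": 0.7, "coverage": 0.5, "robustness": 0.6},
]

if __name__ == "__main__":
    print(pick_representatives(example))
```

A product penalizes a probe set that is weak on any single criterion, which is one plausible reason to prefer it over a sum. Other combination rules would work with the same structure.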

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UCL Discovery

Semmelweis Repository

Online Research Database In Technology

Benchmarking of RNA-sequencing analysis workflows using whole-transcriptome RT-qPCR expression data

Author: Cheng Quek Xiu
Dinger Marcel E
Everaert Celine
Hellemans Jan
Luypaert Manuel
Maag Jesper LV
Mestdagh Pieter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

RNA-sequencing has become the gold standard for whole-transcriptome gene expression quantification. Multiple algorithms have been developed to derive gene counts from sequencing reads. While a number of benchmarking studies have been conducted, the question remains how individual methods perform at accurately quantifying gene expression levels from RNA-sequencing reads. We performed an independent benchmarking study using RNA-sequencing data from the well-established MAQCA and MAQCB reference samples. RNA-sequencing reads were processed using five workflows (Tophat-HTSeq, Tophat-Cufflinks, STAR-HTSeq, Kallisto and Salmon) and the resulting gene expression measurements were compared to expression data generated by wet-lab validated qPCR assays for all protein-coding genes. All methods showed high gene expression correlations with qPCR data. When comparing gene expression fold changes between MAQCA and MAQCB samples, about 85% of the genes showed consistent results between RNA-sequencing and qPCR data. Of note, each method revealed a small but specific gene set with inconsistent expression measurements. A significant proportion of these method-specific inconsistent genes were reproducibly identified in independent datasets. These genes were typically smaller, had fewer exons, and were expressed at lower levels than genes with consistent expression measurements. We propose that careful validation is warranted when evaluating RNA-seq based expression profiles for this specific gene set.
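
A fold-change comparison of this kind can be sketched as below. This is an illustration under assumptions, not the study's procedure: both platforms are assumed to report linear-scale expression values, a gene is called "up" or "down" at an absolute log2 fold change of 1, and the pseudocount, function names, and data are hypothetical.

```python
import math


def log2_fold_change(sample_a, sample_b, pseudocount=1.0):
    """log2 ratio of two linear-scale expression values."""
    return math.log2((sample_a + pseudocount) / (sample_b + pseudocount))


def call(lfc, threshold=1.0):
    """Classify a log2 fold change as 'up', 'down' or 'unchanged'."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


def concordance(rnaseq, qpcr, threshold=1.0):
    """Fraction of shared genes whose fold-change call agrees.

    Both inputs map gene -> (expression in sample A, expression in sample B).
    """
    genes = sorted(set(rnaseq) & set(qpcr))
    if not genes:
        raise ValueError("no genes in common")
    agree = sum(
        call(log2_fold_change(*rnaseq[g]), threshold)
        == call(log2_fold_change(*qpcr[g]), threshold)
        for g in genes
    )
    return agree / len(genes)


# Hypothetical measurements for three hypothetical genes.
rnaseq_demo = {"G1": (100, 10), "G2": (50, 55), "G3": (5, 80)}
qpcr_demo = {"G1": (90, 12), "G2": (40, 42), "G3": (30, 35)}

if __name__ == "__main__":
    print(concordance(rnaseq_demo, qpcr_demo))
```

In the demo data G1 and G2 agree between platforms while G3 is called "down" by the RNA-seq values only, so two of three genes are concordant.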

Ghent University Academic Bibliography

UNSWorks

Cross-hybridization modeling on Affymetrix exon arrays

Author: Affymetrix
Boutz
Casneuf
Clark
Eklund
Gresham
Hubbard
Hui Jiang
Irizarry
Jiang
Johnson
Kapur
Karen Kapur
Li
Li
Mortazavi
Okoniewski
Smith
Srinivasan
Stoughton
Wing Hung Wong
Wu
Xing
Xing
Yeo
Yi Xing
Publication venue: Oxford University Press

Motivation: Microarray designs have become increasingly probe-rich, enabling targeting of specific features, such as individual exons or single nucleotide polymorphisms. These arrays have the potential to achieve quantitative high-throughput estimates of transcript abundances, but currently these estimates are affected by biases due to cross-hybridization, in which probes hybridize to off-target transcripts.

Crossref

PubMed Central

Optimization of the BLASTN substitution matrix for prediction of non-specific DNA microarray hybridization

Author: Eklund Aron C.
Friis Pia
Szallasi Zoltan
Wernersson Rasmus
Publication venue: Oxford University Press
Publication date: 01/01/2009

DNA microarray measurements are susceptible to error caused by non-specific hybridization between a probe and a target (cross-hybridization), or between two targets (bulk-hybridization). Search algorithms such as BLASTN can quickly identify potentially hybridizing sequences. We set out to improve BLASTN accuracy by modifying the substitution matrix and gap penalties. We generated gene expression microarray data for samples in which 1 or 10% of the target mass was an exogenous spike of known sequence. We found that the 10% spike induced 2-fold intensity changes in 3% of the probes, two-thirds of which were decreases in intensity likely caused by bulk-hybridization. These changes were correlated with similarity between the spike and probe sequences. Interestingly, even very weak similarities tended to induce a change in probe intensity with the 10% spike. Using these data, we optimized the BLASTN substitution matrix to more accurately identify probes susceptible to non-specific hybridization with the spike. Relative to the default substitution matrix, the optimized matrix features a decreased score for A–T base pairs relative to G–C base pairs, resulting in a 5–15% increase in area under the ROC curve for identifying affected probes. This optimized matrix may be useful in the design of microarray probes, and in other BLASTN-based searches for hybridization partners.
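
The idea of scoring A–T pairs lower than G–C pairs can be illustrated with a small ungapped scorer. This is a sketch only: the score values are invented for illustration and are not the optimized matrix from the paper, gaps are ignored, and both sequences are assumed to be written so that identical letters at a position stand for a base pair. The names `best_ungapped_score`, `UNIFORM`, and `WEIGHTED` are hypothetical.

```python
def best_ungapped_score(probe, target, match_scores, mismatch=-2):
    """Best score of `probe` against any same-length window of `target`.

    `match_scores` maps each base to the score awarded when that base
    matches; every mismatch receives the same (assumed) penalty.
    """
    if len(target) < len(probe):
        raise ValueError("target is shorter than probe")
    best = None
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        score = sum(
            match_scores[p] if p == t else mismatch
            for p, t in zip(probe, window)
        )
        if best is None or score > best:
            best = score
    return best


# Hypothetical matrices: uniform matches versus G/C weighted above A/T.
UNIFORM = {"A": 1, "C": 1, "G": 1, "T": 1}
WEIGHTED = {"A": 1, "C": 2, "G": 2, "T": 1}

if __name__ == "__main__":
    for probe, target in [("GGCC", "AAGGCCAA"), ("AATT", "GGAATTGG")]:
        print(probe,
              best_ungapped_score(probe, target, UNIFORM),
              best_ungapped_score(probe, target, WEIGHTED))
```

Under the uniform matrix both perfect matches score the same. Under the weighted one the GC-rich match scores higher, so it would rank above the AT-rich match as a candidate for non-specific hybridization.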

CiteSeerX

PubMed Central

Online Research Database In Technology

Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis

Author: A Kendall
A Mackay
A Naderi
A Tichopad
AA Shabalin
AC Culhane
AE Teschendorff
AH Sims
AL Oberg
Alexey A Larionov
Andrew H Sims
Arran K Turnbull
C Desmedt
CY Lin
GC Tseng
GW Snedecor
HS Leong
J Michael Dixon
J Neter
J Rudy
JM Engreitz
JS Parker
JT Leek
KR Ong
L Ein-Dor
L Shi
Lorna Renshaw
M Barnes
M Benito
M Dai
MB Eisen
MJ Okoniewski
ML Lindstrom
MN McCall
NL Barbosa-Morais
NM Laird
R Clarke
R Sandberg
R Shen
RC Gentleman
Robert R Kitchen
RR Kitchen
VG Tusher
VS Sabine
WE Johnson
WR Miller
X Fan
X Lu
Z Hu
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Background: Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single-channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately, many of these studies are underpowered, and it is desirable to improve power by combining data from more than one study; we sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis. Results: Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets, we evaluated several approaches to directly integrate intensity-level expression data from the two platforms. After mapping probe sequences to Ensembl genes, we demonstrate that ComBat and cross-platform normalisation (XPN) significantly outperform mean-centering and distance-weighted discrimination (DWD) in terms of minimising inter-platform variance. In particular, we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially reducing legitimate biological differences from integrated datasets. Conclusion: Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Sequencing and analysis of the Dalbulus maidis transcriptome

Author: Catalano María Inés
Lavore Andrés
Palacio Victorio Gabriel
Rivera Pomar Rolando
Publication date: 18/10/2018

Auchenorrhynchans (leafhoppers and planthoppers) are exclusively phytophagous insects that can cause significant economic damage to crops. One of the diseases they transmit is corn stunt disease, potentially one of the most serious diseases of maize, capable of causing partial or total yield losses in affected areas. In Argentina, Dalbulus maidis (Hemiptera: Auchenorrhyncha) is the only known field vector of Spiroplasma kunkelii, the causal pathogen of corn stunt. Given its importance as an agricultural pest, the transcriptome of all life-cycle stages of this insect was sequenced (eggs, five nymphal stages, and two adult samples). A pool of insects was used to capture as many expressed genes as possible. Because genomic information for Dalbulus maidis is not available, de novo assembly was performed. Assemblies produced with three programs were compared: VELVET OASES, ABySS, and Trinity. They were evaluated using metrics (N50, contig length) and coverage measures (CEG, BUSCO). Based on these analyses, developmental genes were searched for in the VELVET OASES and Trinity assemblies. The total percentage of genes found was higher for the Trinity assembly. In light of these results, the remaining samples were assembled with Trinity, yielding very good metric and coverage values. In addition, the transcriptomes were compared with published proteomes as a measure of homology between species. This work compared different de novo assembly methods and selected the one best suited to our data and experiments.
Affiliation: Palacio, Victorio Gabriel. Universidad Nacional del Noroeste de la Provincia de Buenos Aires
Affiliation: Lavore, Andrés. Universidad Nacional del Noroeste de la Provincia de Buenos Aires
Affiliation: Catalano, María Inés. Universidad Nacional del Noroeste de la Provincia de Buenos Aires
Affiliation: Rivera Pomar, Rolando. Universidad Nacional del Noroeste de la Provincia de Buenos Aires
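
N50, one of the assembly metrics named in this abstract, is commonly defined as the contig length at which contigs of that length or longer account for at least half of the total assembled bases. A minimal sketch of that common definition follows. The function name `n50` and the example lengths are hypothetical, and the abstract does not state which tool computed the metric.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths: total 20, so half is 10; 6 + 5 = 11 >= 10.
    print(n50([2, 3, 4, 5, 6]))
```

A higher N50 indicates a more contiguous assembly, but for transcriptomes it is usually read alongside completeness measures such as BUSCO, as the abstract describes.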

Repositorio OAI Biblioteca Digital Universidad Nacional de Cuyo