Comparison of sequence-dependent tiling array normalization approaches

Author: Chung Ho-Ryun
Vingron Martin
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The detection of enriched DNA or RNA fragments by tiling microarrays has become increasingly popular. These microarrays contain a high number of small probes covering genomic loci. However, to achieve high coverage the probe sequences cannot be selected for their hybridization properties. The affinity of the probes towards their targets varies in a sequence-dependent manner. In order to remove this bias a number of approaches have been developed and shown to increase the detection of enriched DNA or RNA fragments. However, these approaches also employ a peak detection algorithm that is different from the one used previously. Thus, it seems possible that the enhancement of detection is due to the peak detection algorithm rather than the sequence-dependent normalization. Results: We compared three different sequence-dependent probe level normalization procedures to a naïve sequence-independent normalization technique. In order to achieve maximal comparability, we used the normalized intensity values as input to a single peak detection algorithm. A so-called "spike-in" data set served as benchmark for the performance. We will show that the sequence-dependent normalization procedures do not perform better than the naïve approach, suggesting that the benefit of using these normalization approaches is limited. Furthermore, we will show that the naïve approach does well, because it effectively removes the sequence-dependent component of the measured intensities with the help of the control hybridization experiment. Conclusion: Sequence-dependent normalization of microarray data hardly improves the detection of enriched DNA or RNA fragments. The "success" of the sequence-independent naïve approach is only possible due to the control experiment and requires proper scaling of the measured intensities.
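
The claim that a control hybridization removes the sequence-dependent component can be illustrated with a toy calculation. The sketch below is not the authors' procedure: it assumes a simple multiplicative model (intensity = probe affinity × target abundance × array gain) and uses median centring as one possible "proper scaling" step. All names and numbers are hypothetical.

```python
import math
import statistics


def log_ratios(treatment, control):
    """Return median-centred log2(treatment / control) values, one per probe."""
    raw = [math.log2(t / c) for t, c in zip(treatment, control)]
    # Scaling step (assumption): shift so the bulk of probes sits at zero.
    shift = statistics.median(raw)
    return [r - shift for r in raw]


# Hypothetical toy data under the assumed multiplicative model.
affinity = [0.5, 1.0, 2.0, 4.0, 8.0]         # sequence-dependent probe affinity
abundance_treat = [1.0, 1.0, 4.0, 1.0, 1.0]  # third probe is enriched 4-fold
abundance_ctrl = [1.0] * 5

treatment = [3.0 * a * x for a, x in zip(affinity, abundance_treat)]  # gain 3.0
control = [1.5 * a * x for a, x in zip(affinity, abundance_ctrl)]     # gain 1.5

normalized = log_ratios(treatment, control)
print(normalized)
```

Under this model the affinity term appears in both numerator and denominator and cancels, while the median shift absorbs the difference in array gain, so only the enriched probe keeps a non-zero value.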

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Doubly stochastic continuous-time hidden Markov approach for analyzing genome tiling arrays

Author: Johnson W. Evan
Liu Jun S.
Liu X. Shirley
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2008

Microarrays have been developed that tile the entire nonrepetitive genomes of many different organisms, allowing for the unbiased mapping of active transcription regions or protein binding sites across the entire genome. These tiling array experiments produce massive correlated data sets that have many experimental artifacts, presenting many challenges to researchers that require innovative analysis methods and efficient computational algorithms. This paper presents a doubly stochastic latent variable analysis method for transcript discovery and protein binding region localization using tiling array data. This model is unique in that it considers actual genomic distance between probes. Additionally, the model is designed to be robust to cross-hybridized and nonresponsive probes, which can often lead to false-positive results in microarray experiments. We apply our model to a transcript finding data set to illustrate the consistency of our method. Additionally, we apply our method to a spike-in experiment that can be used as a benchmark data set for researchers interested in developing and comparing future tiling array methods. The results indicate that our method is very powerful, accurate and can be used on a single sample and without control experiments, thus defraying some of the overhead cost of conducting experiments on tiling arrays. Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/09-AOAS248 by the Institute of Mathematical Statistics (http://www.imstat.org).
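
To make concrete what "considers actual genomic distance between probes" can mean in a continuous-time hidden Markov setting, the sketch below computes the transition probabilities of a generic two-state continuous-time Markov chain as a function of the gap between neighbouring probes. It is an illustration only, not the model from the paper; the state labels and the rate values are assumptions.

```python
import math


def transition_matrix(distance_bp, rate_on, rate_off):
    """Transition probabilities of a two-state continuous-time Markov chain.

    State 0 = background, state 1 = enriched (hypothetical labels).
    rate_on is the 0 -> 1 rate and rate_off the 1 -> 0 rate, per base pair.
    """
    total = rate_on + rate_off
    pi_on = rate_on / total                 # stationary probability of state 1
    decay = math.exp(-total * distance_bp)  # memory remaining after the gap
    p01 = pi_on * (1.0 - decay)             # background -> enriched
    p10 = (1.0 - pi_on) * (1.0 - decay)     # enriched -> background
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Hypothetical rates: closely spaced probes rarely change state, while
# distant probes are close to independent draws from the stationary law.
near = transition_matrix(10, rate_on=0.001, rate_off=0.01)
far = transition_matrix(10_000, rate_on=0.001, rate_off=0.01)
print(near[0][1], far[0][1])
```

With a discrete-time chain every probe-to-probe step would share one transition matrix; here the matrix changes with the gap, so unevenly spaced probes are handled naturally.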

arXiv.org e-Print Archive

CiteSeerX

Crossref

Model-based analysis of two-color arrays (MA2C)

Author: Chen Runsheng
Johnson W Evan
Li Wei
Liu Jun S
Liu X Shirley
Manrai Arjun K
Song Jun S
Zhang Xinmin
Zhu Xiaopeng
Publication venue: BioMed Central
Publication date: 01/01/2007

A normalization method based on probe GC content for two-color tiling arrays and an algorithm for detecting peak regions are presented. They are available in a stand-alone Java program.
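
The general idea of normalizing by probe GC content can be sketched as follows: probes are grouped by the number of G and C bases, and each log-ratio is standardized within its group. This is a simplified illustration, not the MA2C algorithm itself, which is distributed as a Java program and may differ in its estimators; the function names and data are hypothetical.

```python
from collections import defaultdict
import statistics


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq.upper())


def gc_normalize(probes):
    """Standardize log-ratios within GC-count bins.

    probes: list of (sequence, log_ratio) pairs.
    Returns one z-score per probe, in input order.
    """
    bins = defaultdict(list)
    for seq, value in probes:
        bins[gc_count(seq)].append(value)
    # Per-bin mean and population standard deviation (plain, non-robust
    # estimators chosen here for brevity; this is an assumption).
    stats = {gc: (statistics.mean(v), statistics.pstdev(v)) for gc, v in bins.items()}
    scores = []
    for seq, value in probes:
        mean, sd = stats[gc_count(seq)]
        scores.append((value - mean) / sd if sd > 0 else 0.0)
    return scores


# Hypothetical probes: GC-rich probes read systematically higher.
example = [("ATAT", 1.0), ("ATAT", 3.0), ("GCGC", 10.0), ("GCGC", 14.0)]
scores = gc_normalize(example)
print(scores)
```

After standardization the GC-rich and GC-poor probes are on the same scale, so a single threshold can be applied across the array.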

Crossref

Harvard University - DASH

Springer - Publisher Connector

PubMed Central

Custom Design and Analysis of High-Density Oligonucleotide Bacterial Tiling Microarrays

Author: Alexander D. Rowe
Gard O. S. Thomassen
Janet Kelso
Jessica M. Lindvall
Karin Lagesen
Torbjørn Rognes
Publication venue: Public Library of Science
Publication date: 01/01/2009

Not until recently have custom-made high-density oligonucleotide microarrays been available at an affordable price. The aim of this thesis was to design microarrays and analysis algorithms for DNA repair and DNA damage detection, and to apply the methods in real experiments. Thomassen et al. have used their custom-designed whole-genome tiling microarrays for detection of transcriptional changes in Escherichia coli after exposure to DNA-damaging reagents. The transcriptional changes in E. coli treated with UV light or the methylating reagent MNNG were shown to be larger and to include far more genes than previously reported. To optimize the data analysis for the custom-made arrays, Thomassen and coworkers designed their own normalization and analysis algorithms, and showed these to be more suitable than established methods that are currently applied on custom tiling arrays. Among other findings, several novel stress-induced transcripts were detected, of which one is predicted to be a UV-induced short transmembrane protein. Additionally, no upregulation of the previously described UV-inducible aidB is shown. In the MNNG study several genes are shown as downregulated in response to DNA damage although having upstream regulatory sequences similar to the established LexA box A and B. This indicates that the LexA regulon also might control gene repression and that the box A and B sequence cannot alone account for the LexA-controlled gene regulation. Thomassen et al. have also custom designed a microarray for oncogenic fusion gene detection. Cancer-specific fusion genes are often used to subgroup cancers and to define the optimal treatment, but currently the laboratory detection procedure is both laborious and tedious. In a blinded study on six cancer cell lines proof of principle was shown by detection of six out of six positive controls. The design and analysis methods for this microarray are now being refined to make a diagnostic fusion gene detection tool.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

NORA - Norwegian Open Research Archives

Starr: Simple Tiling ARRay analysis of Affymetrix ChIP-chip data

Author: Kuan Pei Fen
Tresch Achim
Zacher Benedikt
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) is an assay used for investigating DNA-protein-binding or post-translational chromatin/histone modifications. As with all high-throughput technologies, it requires thorough bioinformatic processing of the data for which there is no standard yet. The primary goal is to reliably identify and localize genomic regions that bind a specific protein. Further investigation compares binding profiles of functionally related proteins, or binding profiles of the same proteins in different genetic backgrounds or experimental conditions. Ultimately, the goal is to gain a mechanistic understanding of the effects of DNA binding events on gene expression. Results: We present a free, open-source R/Bioconductor package Starr that facilitates comparative analysis of ChIP-chip data across experiments and across different microarray platforms. The package provides functions for data import, quality assessment, data visualization and exploration. Starr includes high-level analysis tools such as the alignment of ChIP signals along annotated features, correlation analysis of ChIP signals with complementary genomic data, peak-finding and comparative display of multiple clusters of binding profiles. It uses standard Bioconductor classes for maximum compatibility with other software. Moreover, Starr automatically updates microarray probe annotation files by a highly efficient remapping of microarray probe sequences to an arbitrary genome. Conclusion: Starr is an R package that covers the complete ChIP-chip workflow from data processing to binding pattern detection. It focuses on the high-level data analysis, e.g., it provides methods for the integration and combined statistical analysis of binding profiles and complementary functional genomics data. Starr enables systematic assessment of binding behaviour for groups of genes that are aligned along arbitrary genomic features.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Access LMU

Carolina Digital Repository

Improved ChIP-chip analysis by a mixture model approach

Author: A Barski
A Savitzky
AP Fejes
B Efron
B Ren
BW Silverman
C Workman
DS Johnson
ER Mardis
GE Crawford
H Ji
H Ji
Ian J Davis
J Rozowsky
J Steinier
JA Berger
JD Lieb
JD Storey
JS Song
M Zheng
MA Newton
MA Newton
Michael J Buck
MJ Buck
MJ Buck
Mukund Patel
PG Giresi
PG Giresi
PJ Sabo
R Development Core Team
R Gottardo
S Cawley
S Keles
S Keles
TH Kim
TH Kim
W Li
W Sun
WE Johnson
Wei Sun
WH Press
Y Benjamini
YH Yang
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Microarray analysis of immunoprecipitated chromatin (ChIP-chip) has evolved from a novel technique to a standard approach for the systematic study of protein-DNA interactions. In ChIP-chip, sites of protein-DNA interactions are identified by signals from the hybridization of selected DNA to tiled oligomers and are graphically represented as peaks. Most existing methods were designed for the identification of relatively sparse peaks, in the presence of replicates. Results: We propose a data normalization method and a statistical method for peak identification from ChIP-chip data based on a mixture model approach. In contrast to many existing methods, including methods that also employ mixture model approaches, our method is more flexible by imposing less restrictive assumptions and allowing a relatively large proportion of peak regions. In addition, our method does not require experimental replicates and is computationally efficient. We compared the performance of our method with several representative existing methods on three datasets, including a spike-in dataset. These comparisons demonstrate that our approach is more robust and has comparable or higher power than the other methods, especially in the context of abundant peak regions. Conclusion: Our data normalization and peak detection methods have improved performance to detect peak regions in ChIP-chip data.
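
As a rough illustration of what a mixture model approach to peak identification involves, the sketch below fits a generic two-component Gaussian mixture (background and peak) to probe scores with the EM algorithm and reports each probe's posterior probability of belonging to the peak component. It is a textbook construction under assumed Gaussian components, not the model proposed in the paper; the data and names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, iterations=50):
    """Fit background (index 0) and peak (index 1) Gaussians by EM.

    Returns (weights, means, sigmas, posteriors), where posteriors[i] is the
    probability that values[i] belongs to the peak component, taken from the
    final E-step.
    """
    lo, hi = min(values), max(values)
    mu = [lo, hi]                    # crude initialization at the extremes
    spread = (hi - lo) / 4.0 or 1.0
    sigma = [spread, spread]
    weight = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(iterations):
        # E-step: posterior probability of the peak component per probe.
        resp = []
        for x in values:
            p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and spread of each component.
        for k in (0, 1):
            r = [ri if k == 1 else 1.0 - ri for ri in resp]
            n_k = sum(r)
            if n_k < 1e-9:
                continue
            weight[k] = n_k / len(values)
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / n_k
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids collapse
    return weight, mu, sigma, resp


# Hypothetical probe scores: ten background probes, three peak probes.
probe_scores = [-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, -0.15, 0.15, 0.05,
                2.8, 3.0, 3.2]
weights, means, sigmas, posteriors = fit_two_component_mixture(probe_scores)
print(weights, means)
```

Because the peak weight is estimated from the data rather than fixed at a small value, a fit of this kind does not presuppose that peaks are rare, which is the property the abstract emphasizes.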

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Carolina Digital Repository

Definition of the σW regulon of Bacillus subtilis in the absence of stress

Author: A Petersohn
A Saito
Adam Driks
AJ Jervis
AW Kingston
BG Butcher
BM Alba
BM Bolstad
C Eymann
CD Ellermeier
E Padan
Emma L. Denham
G Chen
H Hahne
I Wadenpohl
J Cheng
J Heinrich
J Heinrich
J Heinrich
J Heinrich
Jan Maarten van Dijl
JC Zweers
JD Helmann
Jessica C. Zweers
K Asai
K Kanehara
K Strimmer
KT Hughes
L Steil
M Cao
M Cao
M Cao
M Ogura
M Pietiainen
M Yoshimura
MS Turner
P Bisicchia
P Nicolas
Pierre Nicolas
RE Dalbey
S Dubrac
S Jordan
S Leskela
S Rasmussen
S Schobel
S Sterberg
S Tojo
T Mascher
T Wiegert
TD Ho
Thomas Wiegert
W Eiamphungporn
X Huang
X Huang
Y Luo
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Bacteria employ extracytoplasmic function (ECF) sigma factors for their responses to environmental stresses. Despite intensive research, the molecular dissection of ECF sigma factor regulons has remained a major challenge due to overlaps in the ECF sigma factor-regulated genes and the stimuli that activate the different ECF sigma factors. Here we have employed tiling arrays to single out the ECF σW regulon of the Gram-positive bacterium Bacillus subtilis from the overlapping ECF σX, σY, and σM regulons. For this purpose, we profiled the transcriptome of a B. subtilis sigW mutant under non-stress conditions to select candidate genes that are strictly σW-regulated. Under these conditions, σW exhibits a basal level of activity. Subsequently, we verified the σW-dependency of candidate genes by comparing their transcript profiles to transcriptome data obtained with the parental B. subtilis strain 168 grown under 104 different conditions, including relevant stress conditions, such as salt shock. In addition, we investigated the transcriptomes of rasP or prsW mutant strains that lack the proteases involved in the degradation of the σW anti-sigma factor RsiW and subsequent activation of the σW-regulon. Taken together, our studies identify 89 genes as being strictly σW-regulated, including several genes for non-coding RNAs. The effects of rasP or prsW mutations on the expression of σW-dependent genes were relatively mild, which implies that σW-dependent transcription under non-stress conditions is not strictly related to RasP and PrsW. Lastly, we show that the pleiotropic phenotype of rasP mutant cells, which have defects in competence development, protein secretion and membrane protein production, is not mirrored in the transcript profile of these cells. This implies that RasP is not only important for transcriptional regulation via σW, but that this membrane protease also exerts other important post-transcriptional regulatory functions.

University of Groningen

Directory of Open Access Journals

HAL Descartes

Warwick Research Archives Portal Repository

ProdInra

Hal-Diderot

FigShare

Public Library of Science (PLOS)

Crossref

Proceedings - University of Groningen

ARTS repository - University of Groningen

PubMed Central

Dissertations of the University of Groningen

The Dawning Era of Comprehensive Transcriptome Analysis in Cellular Microbiology

Author: Aikawa Chihiro
Maruyama Fumito
Nakagawa Ichiro
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2010

Bacteria rapidly change their transcriptional patterns during infection in order to adapt to the host environment. To investigate host–bacteria interactions, various strategies including the use of animal infection models, in vitro assay systems and microscopic observations have been used. However, these studies primarily focused on a few specific genes and molecules in bacteria. High-density tiling arrays and massively parallel sequencing analyses are rapidly improving our understanding of the complex host–bacterial interactions through identification and characterization of bacterial transcriptomes. Information resulting from these high-throughput techniques will continue to provide novel information on the complexity, plasticity, and regulation of bacterial transcriptomes as well as their adaptive responses relative to pathogenicity. Here we summarize recent studies using these new technologies and discuss the utility of transcriptome analysis.

Crossref

PubMed Central

Frontiers - Publisher Connector

Error-pooling-based statistical methods for identifying novel temporal replication profiles of human chromosomes observed by DNA tiling arrays

Author: Beissbarth
Cho
Dean
Irizarry
Jae K. Lee
Jain
Jeon
Ji
Kampa
Li
Simon
Stefan Bekiranov
Storey
Storey
Taesung Park
White
Woodfine
Youngchul Kim
Publication venue: Oxford University Press

Statistical analysis on tiling array data is extremely challenging due to the astronomically large number of sequence probes, high noise levels of individual probes and limited number of replicates in these data. To overcome these difficulties, we first developed statistical error estimation and weighted ANOVA modeling approaches to high-density tiling array data, especially the former based on an advanced error-pooling method to accurately obtain heterogeneous technical error of small-sample tiling array data. Based on these approaches, we analyzed the high-density tiling array data of the temporal replication patterns during cell-cycle S phase of synchronized HeLa cells on human chromosomes 21 and 22. We found many novel temporal replication patterns, identifying about 26% of over 1 million tiling array sequence probes with significant differential replication during the four 2-h time periods of S phase. Among these differentially replicated probes, 126 941 sequence probes were matched to 417 known genes. The majority of these genes were found to be replicated within one or two consecutive time periods, while the others were replicated at two non-consecutive time periods. Also, coding regions were found to be more differentially replicated in particular time periods than noncoding regions in the gene-poor chromosome 21 (25% differentially replicated among genic probes versus 18.6% among intergenic probes), while such a phenomenon was less prominent in gene-rich chromosome 22. A rigorous statistical testing for local proximity of differentially replicated genic and intergenic probes was performed to identify significant stretches of differentially replicated sequence regions. From this analysis, we found that adjacent genes were frequently replicated at different time periods, potentially implying the existence of quite dense replication origins. Evaluating the conditional probability significance of identified gene ontology terms on chromosomes 21 and 22, we detected some over-represented molecular functions and biological processes among these differentially replicated genes, such as the ones relevant to hydrolase, transferase and receptor-binding activities. Some of these results were confirmed showing >70% consistency with cDNA microarray data that were independently generated in parallel with the tiling arrays. Thus, our improved analysis approaches specifically designed for high-density tiling array data enabled us to reliably and sensitively identify many novel temporal replication patterns on human chromosomes.

Crossref

PubMed Central

TileProbe: modeling tiling array probe effects using publicly available data

Author: Barrett
Bernstein
Bertone
Bolstad
Carroll
Cawley
Hongkai Ji
Huber
Irizarry
Jennifer Toolan Judy
Ji
Ji
Johnson
Kampa
Kapranov
Kapur
Keles
Li
Li
Liu
Ozsolak
Shendure
Urban
Weber
Wu
Yuan
Zhang
Zilliox
Publication venue: Oxford University Press

Motivation: Individual probes on an Affymetrix tiling array usually behave differently. Modeling and removing these probe effects are critical for detecting signals from the array data. Current data processing techniques either require control samples or use probe sequences to model probe-specific variability, such as with MAT. Although the MAT approach can be applied without control samples, residual probe effects continue to distort the true biological signals.

Crossref

PubMed Central