Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle
The effort to identify genes with periodic expression during the cell cycle
from genome-wide microarray time series data has been ongoing for a decade.
However, the lack of rigorous modeling of periodic expression as well as the
lack of a comprehensive model for integrating information across genes and
experiments has impaired the effort for the accurate identification of
periodically expressed genes. To address the problem, we introduce a Bayesian
model to integrate multiple independent microarray data sets from three recent
genome-wide cell cycle studies on fission yeast. A hierarchical model was used
for data integration. In order to facilitate an efficient Monte Carlo sampling
from the joint posterior distribution, we develop a novel Metropolis-Hastings
group move. A surprising finding from our integrated analysis is that more than
40% of the genes in fission yeast are significantly periodically expressed,
far exceeding the 10-15% of genes reported in the current literature.
This calls for a reconsideration of the periodically expressed gene detection
problem. Comment: Published at http://dx.doi.org/10.1214/09-AOAS300 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Exploiting the full power of temporal gene expression profiling through a new statistical test: Application to the analysis of muscular dystrophy data
Background: The identification of biologically interesting genes in a temporal expression profiling
dataset is challenging and complicated by high levels of experimental noise. Most statistical methods
used in the literature do not fully exploit the temporal ordering in the dataset and are not suited
to the case where temporal profiles are measured for a number of different biological conditions.
We present a statistical test that makes explicit use of the temporal order in the data by fitting
polynomial functions to the temporal profile of each gene and for each biological condition. A
Hotelling T²-statistic is derived to detect the genes for which the parameters of these polynomials
are significantly different from each other.
Results: We validate the temporal Hotelling T²-test on muscular gene expression data from four
mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan- and
gamma-sarcoglycan-deficient mice, and wild-type mice. The first three are animal models for different
muscular dystrophies. Extensive biological validation shows that the method is capable of finding
genes with temporal profiles significantly different across the four strains, as well as identifying
potential biomarkers for each form of the disease. The added value of the temporal test compared
to an identical test which does not make use of temporal ordering is demonstrated via a simulation
study, and through confirmation of the expression profiles from selected genes by quantitative PCR
experiments. The proposed method maximises the detection of the biologically interesting genes,
whilst minimising false detections.
Conclusion: The temporal Hotelling T²-test is capable of finding relatively small and robust sets
of genes that display different temporal profiles between the conditions of interest. The test is
simple, it can be used on gene expression data generated from any experimental design and for any
number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R
code is available from V.V. The microarray data have been submitted to GEO under series
GSE1574 and GSE3523
The Cyclohedron Test for Finding Periodic Genes in Time Course Expression Studies
The problem of finding periodically expressed genes from time course
microarray experiments is at the center of numerous efforts to identify the
molecular components of biological clocks. We present a new approach to this
problem based on the cyclohedron test, which is a rank test inspired by recent
advances in algebraic combinatorics. The test has the advantage of being robust
to measurement errors, and can be used to ascertain the significance of
top-ranked genes. We apply the test to recently published measurements of gene
expression during mouse somitogenesis and find 32 genes that collectively are
significant. Among these are previously identified periodic genes involved in
the Notch/FGF and Wnt signaling pathways, as well as novel candidate genes that
may play a role in regulating the segmentation clock. These results confirm
that there is an abundance of exceptionally periodic genes expressed during
somitogenesis. The emphasis of this paper is on the statistics and
combinatorics that underlie the cyclohedron test and its implementation within
a multiple testing framework. Comment: Revision consists of reorganization and further statistical
discussion; 19 pages, 4 figures
Nonlinear Model-Based Method for Clustering Periodically Expressed Genes
Clustering periodically expressed genes from their time-course expression data could help in understanding the molecular mechanisms of the underlying biological processes. In this paper, we propose a nonlinear model-based clustering method for periodically expressed gene profiles. As periodically expressed genes are associated with periodic biological processes, the proposed method naturally assumes that a periodically expressed gene dataset is generated by a number of periodic processes. Each periodic process is modelled by a linear combination of trigonometric sine and cosine functions in time plus a Gaussian noise term. A two-stage method is proposed to estimate the model parameters, and a relocation-iteration algorithm is employed to assign each gene to an appropriate cluster. A bootstrapping method and an average adjusted Rand index (AARI) are employed to measure the quality of clustering. One synthetic dataset and two biological datasets were employed to evaluate the performance of the proposed method. The results show that our method yields better-quality clustering than other clustering methods (e.g., k-means) for periodically expressed gene data, and thus it is an effective cluster analysis method for periodically expressed gene data
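As a rough illustration of the model described in this abstract (a sine and cosine combination in time plus noise), the following minimal Python sketch fits a single harmonic to one expression profile by ordinary least squares. It is not the authors' two-stage estimator or their clustering algorithm; the function names, the assumption of a known period, and the synthetic profile are all hypothetical choices made for this example.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_harmonic(times, values, period):
    """Least-squares fit of values ~ c + a*cos(w*t) + b*sin(w*t), with w = 2*pi/period.

    The period is assumed known here (a simplification made for this sketch).
    Returns (c, a, b, amplitude, phase), where the fitted curve equals
    c + amplitude * cos(w*t - phase).
    """
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    c, a, b = solve_linear(xtx, xty)
    return c, a, b, math.hypot(a, b), math.atan2(b, a)


# Hypothetical noise-free profile sampled every 4 hours over two 24-hour cycles.
times = list(range(0, 48, 4))
w = 2.0 * math.pi / 24.0
values = [2.0 + 1.5 * math.cos(w * t - 0.7) for t in times]
c, a, b, amplitude, phase = fit_harmonic(times, values, period=24.0)
print(c, amplitude, phase)
```

With real data the fitted amplitude and phase would carry noise, and genes could then be grouped by these fitted coefficients; that grouping step is outside the scope of this sketch.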
Systems Biology and the Development of Vaccines and Drugs for Malaria Treatments
The sequencing race has ended and the functional race has already begun. Microarray technology enables
simultaneous gene expression analysis of thousands of genes, providing a snapshot of an organism's
transcriptome at an unprecedented resolution. The close correlation between gene transcription and
function allows the inference of biological processes from the assessed transcriptome profile. Among the
sophisticated analytical problems in microarray technology at the front and back ends, respectively, are the
selection of optimal DNA oligos and the computational analysis of gene expression. In this review paper,
we analyse important methods in use today for customized oligo design. In the course of doing so,
we discovered that the oligo design algorithm hung on gene PFA0135w of chromosome 1 while
designing oligos for the gene sequences of Plasmodium falciparum. We do not yet know the reason for this,
as the algorithm runs on other sequences, such as those of yeast (Saccharomyces cerevisiae) and Neurospora
crassa. We conclude the paper by highlighting the procedures encompassing the back-end phase and discuss
their application to the development of vaccines and drugs for malaria treatment. Note that malaria
causes significant global morbidity and mortality, with 300-500 million cases annually. Our aims are not
ends, but a means to achieve the following: to reiterate the need for experimental biologists to (i) know how to
design their customized oligos and (ii) have some idea about gene expression analysis, and the need for
cooperation between experimental biologists and their counterparts, the computational biologists. These
will help experimental biologists to coordinate the front and back ends of the systems
biology analysis of the whole genome effectively
Cyclebase.org—a comprehensive multi-organism online database of cell-cycle experiments
The past decade has seen the publication of a large number of cell-cycle microarray studies and many more are in the pipeline. However, data from these experiments are not easy to access, combine and evaluate. We have developed a centralized database with an easy-to-use interface, Cyclebase.org, for viewing and downloading these data. The user interface facilitates searches for genes of interest as well as downloads of genome-wide results. Individual genes are displayed with graphs of expression profiles throughout the cell cycle from all available experiments. These expression profiles are normalized to a common timescale to enable inspection of the combined experimental evidence. Furthermore, state-of-the-art computational analyses provide key information on both individual experiments and combined datasets such as whether or not a gene is periodically expressed and, if so, the time of peak expression. Cyclebase is available at http://www.cyclebase.org
Unbiased Boolean analysis of public gene expression data for cell cycle gene identification.
Cell proliferation is essential for the development and maintenance of all organisms and is dysregulated in cancer. Using synchronized cells progressing through the cell cycle, pioneering microarray studies defined cell cycle genes based on cyclic variation in their expression. However, the concordance of the small number of synchronized cell studies has been limited, leading to discrepancies in definition of the transcriptionally regulated set of cell cycle genes within and between species. Here we present an informatics approach based on Boolean logic to identify cell cycle genes. This approach used the vast array of publicly available gene expression data sets to query similarity to CCNB1, which encodes the cyclin subunit of the Cdk1-cyclin B complex that triggers the G2-to-M transition. In addition to highlighting conservation of cell cycle genes across large evolutionary distances, this approach identified contexts where well-studied genes known to act during the cell cycle are expressed and potentially acting in nondivision contexts. An accessible web platform enables a detailed exploration of the cell cycle gene lists generated using the Boolean logic approach. The methods employed are straightforward to extend to processes other than the cell cycle
Time-Course Analysis of Cyanobacterium Transcriptome: Detecting Oscillatory Genes
The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data one can identify the dynamics of the gene expression time series. The detection of genes that are periodically expressed is an important step that allows us to study the regulatory mechanisms associated with the circadian cycle. The problem of finding periodicity in biological time series poses many challenges. These challenges arise because the observed time series usually exhibit non-idealities, such as noise, short length, outliers and unevenly sampled time points. Consequently, the method for finding periodicity should preferably be robust against such anomalies in the data. In this paper, we propose a general and robust procedure for identifying genes with a periodic signature at a given significance level. This identification method is based on autoregressive models and information theory. By using simulated data we show that the suggested method is capable of identifying rhythmic profiles even in the presence of noise and when the number of data points is small. Through our analysis, we uncover the circadian rhythmic patterns underlying the gene expression profiles from the cyanobacterium Synechocystis