Identifying stochastic oscillations in single-cell live imaging time series using Gaussian processes
Multiple biological processes are driven by oscillatory gene expression at
different time scales. Pulsatile dynamics are thought to be widespread, and
single-cell live imaging of gene expression has led to a surge of dynamic,
possibly oscillatory, data for different gene networks. However, the regulation
of gene expression at the level of an individual cell involves reactions
between finite numbers of molecules, and this can result in inherent randomness
in expression dynamics, which blurs the boundaries between aperiodic
fluctuations and noisy oscillators. Thus, there is an acute need for an
objective statistical method for classifying whether an experimentally derived
noisy time series is periodic. Here we present a new data analysis method that
combines mechanistic stochastic modelling with the powerful methods of
non-parametric regression with Gaussian processes. Our method can distinguish
oscillatory gene expression from random fluctuations of non-oscillatory
expression in single-cell time series, despite peak-to-peak variability in
period and amplitude of single-cell oscillations. We show that our method
outperforms the Lomb-Scargle periodogram in successfully classifying cells as
oscillatory or non-oscillatory in data simulated from a simple genetic
oscillator model and in experimental data. Analysis of bioluminescent live cell
imaging shows a significantly greater number of oscillatory cells when
luciferase is driven by a Hes1 promoter (10/19), which has previously
been reported to oscillate, than the constitutive MoMuLV 5' LTR (MMLV) promoter
(0/25). The method can be applied to data from any gene network to both
quantify the proportion of oscillating cells within a population and to measure
the period and quality of oscillations. It is publicly available as a MATLAB
package. Comment: 36 pages, 17 figures
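The idea of classifying a time series by comparing an oscillatory and a non-oscillatory Gaussian process model can be sketched as follows. This is a minimal illustration, not the published MATLAB package: the two covariance functions (an exponentially decaying kernel, and the same kernel modulated by a cosine), the fixed hyperparameters, and all function names are assumptions chosen for the example, since the abstract does not specify them. In practice the hyperparameters would be fitted to each cell's data rather than fixed.

```python
import numpy as np

def ou_kernel(t, sigma2, alpha):
    """Assumed aperiodic covariance: sigma2 * exp(-alpha * |tau|)."""
    tau = np.abs(t[:, None] - t[None, :])
    return sigma2 * np.exp(-alpha * tau)

def ouosc_kernel(t, sigma2, alpha, beta):
    """Assumed noisy-oscillator covariance: the kernel above times cos(beta * tau)."""
    tau = np.abs(t[:, None] - t[None, :])
    return sigma2 * np.exp(-alpha * tau) * np.cos(beta * tau)

def log_marginal_likelihood(y, K, noise_var):
    """Log density of y under a zero-mean GP with covariance K plus white noise."""
    n = len(y)
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ a - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi)

# Synthetic "cell": one draw from the oscillatory model (period 3 h, sampled every 15 min).
rng = np.random.default_rng(0)
t = np.arange(0.0, 24.0, 0.25)
sigma2, alpha, beta, noise_var = 1.0, 0.1, 2 * np.pi / 3.0, 0.01
K_true = ouosc_kernel(t, sigma2, alpha, beta) + noise_var * np.eye(len(t))
y = np.linalg.cholesky(K_true) @ rng.standard_normal(len(t))

# Compare how well each model explains the same series.
ll_osc = log_marginal_likelihood(y, ouosc_kernel(t, sigma2, alpha, beta), noise_var)
ll_ou = log_marginal_likelihood(y, ou_kernel(t, sigma2, alpha), noise_var)
llr = ll_osc - ll_ou
print(f"log-likelihood ratio (oscillatory vs aperiodic): {llr:.1f}")
```

A positive log-likelihood ratio favours the oscillatory model for this series; turning that ratio into a calibrated classification of cells would need a further step that this sketch does not attempt.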
Global testing against sparse alternatives in time-frequency analysis
In this paper, an over-sampled periodogram higher criticism (OPHC) test is
proposed for the global detection of sparse periodic effects in a
complex-valued time series. An explicit minimax detection boundary is
established between the rareness and weakness of the complex sinusoids hidden
in the series. The OPHC test is shown to be asymptotically powerful in the
detectable region. Numerical simulations illustrate and verify the
effectiveness of the proposed test. Furthermore, the periodogram over-sampled
by is proven universally optimal in global testing for
periodicities under a mild minimum separation condition. Comment: Published at http://dx.doi.org/10.1214/15-AOS1412 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Periodicity, Change Detection and Prediction in Microarrays
Three topics in the analysis of microarray genomic data are discussed and improved statistical methods are developed in each case. A statistical test with higher power is developed for detecting periodicity in microarray time series data. Periodicity in short series, with non-Fourier frequencies, is detected through a Pearson curve calibrated to the null distribution obtained by computer simulation. Unlike other traditional methods, this approach is applicable even in the presence of missing values or unequal time intervals. The usefulness of the new method is demonstrated on simulated series as well as actual microarray time series.
The second topic develops a new method for detecting changes in DNA or gene copy number. Regions of DNA copy number aberration in chromosomal material are detected using the maximal overlap discrete wavelet transform (MODWT). It is shown how repeated application of the MODWT to a series can be used to confirm the presence of change points. Application to simulated as well as array CGH (Comparative Genomic Hybridization) data confirms the excellent performance of this method.
In the third topic, an improved class predictor for tissue samples in microarray experiments is developed by incorporating nearest neighbour covariates (NNC). It is demonstrated that this method reduces misclassification errors in both simulated and actual microarray data.
Comprehensive analysis of circadian periodic pattern in plant transcriptome
BACKGROUND: Circadian rhythm is a crucial factor in the orchestration of plant physiology, keeping it in synchrony with the daylight cycle. Previous studies have reported that up to 16% of the plant transcriptome is expressed in a circadian pattern. RESULTS: Our studies of mammalian gene expression revealed circadian baseline oscillation in nearly 100% of genes. Here we present a comprehensive analysis of periodicity in two independent data sets. Application of the advanced algorithms and analytic approaches already tested on animal data reveals oscillation in almost every gene of Arabidopsis thaliana. CONCLUSION: This study indicates an even more pervasive role of oscillation in the molecular physiology of plants than previously believed. Earlier studies have dramatically underestimated the prevalence of circadian oscillation in plant gene expression.
Spectral estimation in unevenly sampled space of periodically expressed microarray time series data
BACKGROUND: Periodogram analysis of time series is widespread in biology. A new challenge in analyzing microarray time series data is to identify genes that are periodically expressed. This challenge arises because the observed time series usually exhibit non-idealities such as noise, short length, and unevenly sampled time points. Most methods used in the literature operate on evenly sampled time series and are not suitable for unevenly sampled ones. RESULTS: For evenly sampled data, methods based on the classical Fourier periodogram are often used to detect periodically expressed genes. Recently, the Lomb-Scargle algorithm has been applied to unevenly sampled gene expression data for spectral estimation. However, since the Lomb-Scargle method assumes a single stationary sinusoidal wave with infinite support, it introduces spurious periodic components in the periodogram for data of finite length. In this paper, we propose a new spectral estimation algorithm for unevenly sampled gene expression data. The new method is based on signal reconstruction in a shift-invariant signal space, where a direct spectral estimation procedure is developed using the B-spline basis. Experiments on simulated noisy gene expression profiles show that our algorithm is superior to the Lomb-Scargle algorithm and the classical Fourier periodogram based method in detecting periodically expressed genes. We have applied our algorithm to Plasmodium falciparum and yeast gene expression data, and the results show that the algorithm is able to detect biologically meaningful periodically expressed genes. CONCLUSION: We have proposed an effective method for identifying periodic genes in unevenly sampled microarray time series gene expression data. The method can also be used as an effective tool for interpolating or resampling gene expression time series.
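For readers unfamiliar with the Lomb-Scargle baseline that this abstract compares against, the sketch below computes the classical normalised Lomb-Scargle periodogram directly from its textbook formula on unevenly sampled synthetic data. It does not implement the B-spline method proposed in the paper. The function name `lomb_scargle`, the synthetic 24-hour signal, and the frequency grid are all assumptions made for illustration.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical normalised Lomb-Scargle power at the given frequencies (cycles per unit time)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                      # remove the mean before fitting sinusoids
    var = y.var(ddof=1)
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2 * np.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal on the sample times.
        tau = np.arctan2(np.sum(np.sin(2 * w * t)), np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s)) / (2 * var)
    return power

# Hypothetical expression profile: 24 h rhythm observed at 60 irregular time points over 72 h.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 72.0, 60))
y = np.sin(2 * np.pi * t / 24.0) + 0.3 * rng.standard_normal(len(t))

freqs = np.linspace(0.01, 0.2, 400)       # cycles per hour
power = lomb_scargle(t, y, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
print(f"period at peak power: {best_period:.1f} h")
```

Because the estimator fits sinusoids at the observed time points, it needs no interpolation onto an even grid, which is why it is a common reference method for unevenly sampled series.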