Bayesian inference for spline-based hidden Markov models
B-spline-based hidden Markov models (HMMs), where the emission densities are
specified as mixtures of normalized B-spline basis functions, offer a more
flexible modelling approach to data than conventional parametric HMMs. We
introduce a fully Bayesian framework for inference in these nonparametric
models where the number of states may be unknown along with other model
parameters. We propose the use of a trans-dimensional Markov chain inference
algorithm to identify a parsimonious knot configuration of the B-splines while
model selection regarding the number of states can be performed within a
parallel sampling framework. The feasibility and efficiency of our proposed
methodology are shown in a simulation study. Its explorative use for real data
is demonstrated for activity acceleration data in animals, i.e.
whitetip-sharks. The flexibility of a Bayesian approach allows us to extend the
modelling framework in a straightforward way and we demonstrate this by
developing a hierarchical conditional HMM to analyse human accelerometer activity
data to focus on studying small movements and/or inactivity during sleep.
A temporal switch model for estimating transcriptional activity in gene expression
Motivation: The analysis and mechanistic modelling of time series gene expression data provided by techniques such as microarrays, NanoString, reverse transcription–polymerase chain reaction and advanced sequencing are invaluable for developing an understanding of the variation in key biological processes. We address this by proposing the estimation of a flexible dynamic model, which decouples temporal synthesis and degradation of mRNA and, hence, allows for transcriptional activity to switch between different states.
Results: The model is flexible enough to capture a variety of observed transcriptional dynamics, including oscillatory behaviour, in a way that is compatible with the demands imposed by the quality, time-resolution and quantity of the data. We show that the timing and number of switch events in transcriptional activity can be estimated alongside individual gene mRNA stability with the help of a Bayesian reversible jump Markov chain Monte Carlo algorithm. To demonstrate the methodology, we focus on modelling the wild-type behaviour of a selection of 200 circadian genes of the model plant Arabidopsis thaliana. The results support the idea that using a mechanistic model to identify transcriptional switch points is likely to contribute strongly to efforts in elucidating and understanding key biological processes, such as transcription and degradation.
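The idea of decoupling synthesis from degradation can be illustrated with a minimal sketch. It assumes a piecewise-constant synthesis rate that changes at the switch times and first-order degradation, dM/dt = tau_k - delta * M; this is an illustrative simplification, not necessarily the exact model of the paper. The function name, parameter names and numerical values are hypothetical.

```python
import math

def mrna_trajectory(times, switch_times, synthesis_rates, degradation_rate, m0):
    """Evaluate mRNA levels under an assumed switch model.

    Between consecutive switch times the synthesis rate tau is constant, so
    dM/dt = tau - delta * M has the closed-form solution
    M(t) = tau/delta + (M(t0) - tau/delta) * exp(-delta * (t - t0)).
    Assumes times >= 0 and sorted, positive switch_times.
    """
    if len(synthesis_rates) != len(switch_times) + 1:
        raise ValueError("need one synthesis rate per segment")
    levels = []
    for t in times:
        m, t_prev = m0, 0.0
        for k, tau in enumerate(synthesis_rates):
            # The last segment has no upper boundary.
            t_end = switch_times[k] if k < len(switch_times) else float("inf")
            seg_end = min(t, t_end)
            target = tau / degradation_rate  # steady state of this segment
            m = target + (m - target) * math.exp(-degradation_rate * (seg_end - t_prev))
            t_prev = seg_end
            if t <= t_end:
                break
        levels.append(m)
    return levels

# Hypothetical example: transcription off, on at t = 6 h, off again at t = 18 h.
demo_levels = mrna_trajectory(
    times=[0.0, 6.0, 18.0, 20.0],
    switch_times=[6.0, 18.0],
    synthesis_rates=[0.0, 5.0, 0.0],
    degradation_rate=0.5,
    m0=0.0,
)
```

In this sketch the switch times and the degradation rate are fixed inputs. In the abstract they are unknowns, and the number of switches is itself estimated by reversible jump MCMC.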
Bayesian Model Search for Nonstationary Periodic Time Series
We propose a novel Bayesian methodology for analyzing nonstationary time
series that exhibit oscillatory behaviour. We approximate the time series using
a piecewise oscillatory model with unknown periodicities, where our goal is to
estimate the change-points while simultaneously identifying the potentially
changing periodicities in the data. Our proposed methodology is based on a
trans-dimensional Markov chain Monte Carlo (MCMC) algorithm that simultaneously
updates the change-points and the periodicities relevant to any segment between
them. We show that the proposed methodology successfully identifies
time-changing oscillatory behaviour in two applications which are relevant to
e-Health and sleep research, namely the occurrence of ultradian oscillations in
human skin temperature during the time of night rest, and the detection of
instances of sleep apnea in plethysmographic respiratory traces.
Comment: Received 23 Oct 2018, Accepted 12 May 201
Bayesian spline-based hidden Markov models with applications to actimetry data and sleep analysis
B-spline-based hidden Markov models employ B-splines to specify the emission distributions, offering a more flexible modeling approach to data than conventional parametric HMMs. We introduce a Bayesian framework for inference, enabling the simultaneous estimation of all unknown model parameters including the number of states. A parsimonious knot configuration of the B-splines is identified by the use of a trans-dimensional Markov chain sampling algorithm, while model selection regarding the number of states can be performed based on the marginal likelihood within a parallel sampling framework. Using extensive simulation studies, we demonstrate the superiority of our methodology over alternative approaches as well as its robustness and scalability. We illustrate the explorative use of our methods for data on activity in animals, that is, whitetip sharks. The flexibility of our Bayesian approach also facilitates the incorporation of more realistic assumptions and we demonstrate this by developing a novel hierarchical conditional HMM to analyse human activity for circadian and sleep modeling. Supplementary materials for this article are available online.
Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.
Motivation: In this study we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse.
Results: We estimate the parameters of an autoregulatory network providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using Markov chain Monte Carlo techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.
Availability: Supplementary information is submitted with the paper.
Predictability of individual circadian phase during daily routine for medical applications of circadian clocks
Background: Circadian timing of treatments can substantially improve tolerability and efficacy in patients. Indeed, drug metabolism and the cell cycle are controlled by molecular clocks in each cell, and coordinated by the 24-hour core body temperature (CBT) rhythm, which is generated by the hypothalamic pacemaker. Individual circadian phase is currently estimated with questionnaire-based chronotype, center-of-rest time, dim light melatonin onset (DLMO), or timing of CBT maximum (acrophase) or minimum (bathyphase).
Methods: We aimed at circadian phase determination and read-out during daily routine in volunteers stratified by sex and age. We measured (i) chronotype; (ii) q1min CBT using two electronic pills swallowed 24-hours apart; (iii) DLMO through hourly salivary samples from 18:00 to bedtime; (iv) q1min accelerations and surface temperature at anterior chest level for seven days, using a tele-transmitting sensor. Circadian phases were computed using cosinor and Hidden-Markov modelling. Multivariate regression identified the combination of biomarkers that best predicted core temperature circadian bathyphase.
Results: Amongst the 33 participants, individual circadian phases were spread over 5h10min (DLMO), 7h (CBT bathyphase) and 9h10min (surface temperature acrophase). CBT bathyphase was accurately predicted, i.e. with an error <1h for 78.8% of the subjects, using a new digital health algorithm (INTime), combining time-invariant sex and chronotype score, with computed center-of-rest time and surface temperature bathyphase (adjusted R-squared = 0.637).
Conclusion: INTime provided a continuous and reliable circadian phase estimate in real time. This model helps integrate circadian clocks into precision medicine and will enable treatment timing personalisation following further validation.
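The cosinor step named in the Methods can be sketched as follows. This is a minimal single-component cosinor, y(t) = mesor + amplitude * cos(2*pi*(t - acrophase)/period), and not the INTime algorithm itself. It assumes evenly spaced samples covering a whole number of periods, in which case the least-squares estimates reduce to simple averages. The function name and the synthetic temperature values are hypothetical.

```python
import math

def cosinor_fit(times_h, values, period_h=24.0):
    """Fit y(t) = mesor + amplitude * cos(2*pi*(t - acrophase) / period).

    Assumption: samples are evenly spaced and span a whole number of periods,
    so the cosine and sine regressors are orthogonal and the least-squares
    solution is given by the averages below.
    """
    n = len(values)
    omega = 2.0 * math.pi / period_h
    mesor = sum(values) / n
    beta = 2.0 / n * sum(y * math.cos(omega * t) for t, y in zip(times_h, values))
    gamma = 2.0 / n * sum(y * math.sin(omega * t) for t, y in zip(times_h, values))
    amplitude = math.hypot(beta, gamma)
    # Time of the fitted maximum, in hours within one period.
    acrophase_h = (math.atan2(gamma, beta) / omega) % period_h
    # For a single cosine the minimum lies half a period after the maximum.
    bathyphase_h = (acrophase_h + period_h / 2.0) % period_h
    return mesor, amplitude, acrophase_h, bathyphase_h

# Synthetic, noise-free example: one sample per minute over two days,
# with a hypothetical temperature peak at 17:00.
times_h = [i / 60.0 for i in range(2 * 24 * 60)]
values = [37.0 + 0.4 * math.cos(2.0 * math.pi * (t - 17.0) / 24.0) for t in times_h]
mesor, amplitude, acrophase_h, bathyphase_h = cosinor_fit(times_h, values)
```

With irregular sampling or incomplete periods the averages above are no longer exact, and a general linear least-squares fit on the cosine and sine regressors would be needed instead.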
Direct measurement of transcription rates reveals multiple mechanisms for configuration of the Arabidopsis ambient temperature response
Background
Sensing and responding to ambient temperature is important for controlling growth and development of many organisms, in part by regulating mRNA levels. mRNA abundance can change with temperature, but it is unclear whether this results from changes in transcription or decay rates, and whether passive or active temperature regulation is involved.
Results
Using a base analog labelling method, we directly measured the temperature coefficient, Q10, of mRNA synthesis and degradation rates of the Arabidopsis transcriptome. We show that for most genes, transcript levels are buffered against passive increases in transcription rates by balancing passive increases in the rate of decay. Strikingly, for temperature-responsive transcripts, increasing temperature raises transcript abundance primarily by promoting faster transcription relative to decay and not vice versa, suggesting a global transcriptional process exists that controls mRNA abundance by temperature. This is partly accounted for by gene body H2A.Z which is associated with low transcription rate Q10, but is also influenced by other marks and transcription factor activities.
Conclusions
Our data show that less frequent chromatin states can produce temperature responses simply by virtue of their rarity and the difference between their thermal properties and those of the most common states, and underline the advantages of directly measuring transcription rate changes in dynamic systems, rather than inferring rates from changes in mRNA abundance.
Background
The mechanism for ambient temperature sensing in plants is unclear. Control of transcript levels is believed to be important in responses to temperature [1-4] but effects of ambient temperature on transcription and mRNA decay rates have not been measured. According to the work of Arrhenius [5] the temperature coefficient (Q10) of biochemical reactions is expected to be 2 to 3 at biological temperatures: yet less than 2% of Arabidopsis thaliana genes have a two-fold or greater difference in expression level between 17°C and 27°C [6]. The remaining genes either have rates buffered against changing temperatures, or passive increases in transcription rate must be offset by a balanced increase in decay rate, leading to higher turnover but static steady state levels. Despite this fundamental uncertainty, steady state transcriptomic responses to ambient temperature have been used to infer a role for chromatin modifications in temperature signaling [2,7].
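The buffering argument above can be made concrete with a short numerical sketch. It uses the standard definition Q10 = (R2/R1)^(10/(T2 - T1)) and assumes the simplest kinetic model, in which the steady-state transcript level equals the synthesis rate divided by the first-order decay rate. The rate values are hypothetical and chosen only for illustration.

```python
def q10(rate_low, rate_high, temp_low_c, temp_high_c):
    """Temperature coefficient: fold change in rate per 10 degree C increase."""
    return (rate_high / rate_low) ** (10.0 / (temp_high_c - temp_low_c))

def scale_rate(rate, q10_value, delta_t_c):
    """Rate after a temperature change of delta_t_c degrees, given its Q10."""
    return rate * q10_value ** (delta_t_c / 10.0)

def steady_state_level(synthesis_rate, decay_rate):
    """Assumed model: dM/dt = synthesis - decay * M, so M* = synthesis / decay."""
    return synthesis_rate / decay_rate

# Hypothetical rates at 17 degrees C (arbitrary units).
synthesis_17, decay_17 = 10.0, 0.5

# Case 1: synthesis and decay share the same Q10, so the level is buffered.
synthesis_27 = scale_rate(synthesis_17, 2.5, 10.0)
decay_27 = scale_rate(decay_17, 2.5, 10.0)
level_17 = steady_state_level(synthesis_17, decay_17)
level_27_buffered = steady_state_level(synthesis_27, decay_27)

# Case 2: synthesis has a higher Q10 than decay, so abundance rises.
level_27_responsive = steady_state_level(
    scale_rate(synthesis_17, 3.0, 10.0), scale_rate(decay_17, 2.0, 10.0)
)
```

In the first case both rates increase 2.5-fold, so turnover is faster but the steady-state level is unchanged. In the second case the level rises by the ratio of the two Q10 values.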
4-Thiouracil (4SU) is a non-toxic base analogue that has been shown to be incorporated into mammalian and yeast mRNA during transcription [8-12]. Biotinylation and column separation allow 4SU-labeled RNA to be separated from unlabeled RNA, and transcriptomic analysis using the separated samples can be used to simultaneously calculate mRNA synthesis and decay rates [8]. Here we use 4SU labeling to measure transcription rates and determine the Q10 genome-wide of mRNA synthesis and decay rates in Arabidopsis thaliana. We show that ambient temperature has large passive effects on both mRNA synthesis and decay rates, and that where temperature controls transcript abundance it does so by regulating transcription relative to decay and not vice versa. Our analysis suggests that transcription factor binding sites and epigenetic state combine to create a complex network of temperature responses in plants.
Results
Cells incorporate 4SU into RNA and this has been exploited in mammalian cells [8,11,12] and in yeast [13] to measure mRNA synthesis and decay rates. In order to determine whether plants can take up 4SU we floated intact seedlings in MS medium and monitored 4SU incorporation into RNA by biotinylation and dot blot (Figure S1a in Additional file 1). This clearly showed that plants incorporate 4SU from the environment into RNA and that concentrations as low as 1 mM lead to a signal detectable above background within 1 hour (Figure 1B). The resulting RNA could be separated from unlabeled RNA by biotinylation and passage through a streptavidin column as described previously. At 1.5 mM the flow-through can be depleted of detectable 4SU-labeled RNA, whilst labeled plant RNA is highly concentrated in the fraction recovered from the column [8,13] (Figure S1c in Additional file 1). To maximize recovery we chose a low concentration of 4SU at 1.5 mM [8] as high labeling frequencies are known to lead to binding of fewer, more frequently labeled transcripts to the columns and reduce recovery. At this concentration Arabidopsis plants treated with 4SU showed the same growth and survival as control plants (Figure S2a in Additional file 1), suggesting 4SU has low toxicity in plants, as in other organisms. Therefore, 4SU dynamics in Arabidopsis seedlings resemble those described for other experimental systems. Preliminary experiments showed that RNA turnover was faster at 27°C compared to 12°C (Figure S2b in Additional file 1), suggesting that temperature generally affected transcription rates.
A spatio-temporal model to reveal oscillator phenotypes in molecular clocks: Parameter estimation elucidates circadian gene transcription dynamics in single-cells.
We propose a stochastic distributed delay model together with a Markov random field prior and a measurement model for bioluminescence-reporting to analyse spatio-temporal gene expression in intact networks of cells. The model describes the oscillating time evolution of molecular mRNA counts through a negative transcriptional-translational feedback loop encoded in a chemical Langevin equation with a probabilistic delay distribution. The model is extended spatially by means of a multiplicative random effects model with a first order Markov random field prior distribution. Our methodology effectively separates intrinsic molecular noise, measurement noise, and extrinsic noise and phenotypic variation driving cell heterogeneity, while being amenable to parameter identification and inference. Based on the single-cell model we propose a novel computational stability analysis that allows us to infer two key characteristics, namely the robustness of the oscillations, i.e. whether the reaction network exhibits sustained or damped oscillations, and the profile of the regulation, i.e. whether the inhibition occurs over time in a more distributed versus a more direct manner, which affects the cells' ability to phase-shift to new schedules. We show how insight into the spatio-temporal characteristics of the circadian feedback loop in the suprachiasmatic nucleus (SCN) can be gained by applying the methodology to bioluminescence-reported expression of the circadian core clock gene Cry1 across mouse SCN tissue. 
We find that while (almost) all SCN neurons exhibit robust cell-autonomous oscillations, the parameters that are associated with the regulatory transcription profile give rise to a spatial division of the tissue between the central region whose oscillations are resilient to perturbation in the sense that they maintain a high degree of synchronicity, and the dorsal region which appears to phase shift in a more diversified way as a response to large perturbations and thus could be more amenable to entrainment.
A hierarchical model of transcriptional dynamics allows robust estimation of transcription rates in populations of single cells with variable gene copy number
Motivation: cis-regulatory DNA sequence elements, such as enhancers and silencers, function to control the spatial and temporal expression of their target genes. Although the overall levels of gene expression in large cell populations seem to be precisely controlled, transcription of individual genes in single cells is extremely variable in real time. It is, therefore, important to understand how these cis-regulatory elements function to dynamically control transcription at single-cell resolution. Recently, statistical methods have been proposed to back-calculate the rates involved in mRNA transcription using parameter estimation of a mathematical model of transcription and translation. However, a major complication in these approaches is that some of the parameters, particularly those corresponding to the gene copy number and transcription rate, cannot be distinguished; therefore, these methods cannot be used when the copy number is unknown.
Results: Here, we develop a hierarchical Bayesian model to estimate biokinetic parameters from live cell enhancer–promoter reporter measurements performed on a population of single cells. This allows us to investigate transcriptional dynamics when the copy number is variable across the population. We validate our method using synthetic data and then apply it to quantify the function of two known developmental enhancers in real time and in single cells.