
    Automated Bayesian model development for frequency detection in biological time series

    Background: A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies.
    Results: In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlight the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing.
    Conclusions: Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure.

    Functional assessment of time course microarray data

    Motivation: Time-course microarray experiments study the progress of gene expression over time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires approaches that exploit the activation dynamics of the functional categories to which genes are annotated.
    Methods: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies.
    Results: Synthetic and experimental data showed that the different methods are able to capture different, biologically meaningful aspects of the relationship between genes, functions and co-expression. The methods should not be regarded as competing; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.

    Molecular Epidemiology and Evolution of Human Respiratory Syncytial Virus and Human Metapneumovirus

    Human respiratory syncytial virus (HRSV) and human metapneumovirus (HMPV) are ubiquitous respiratory pathogens of the Pneumovirinae subfamily of the Paramyxoviridae. Two major surface antigens are expressed by both viruses: the highly conserved fusion (F) protein and the extremely diverse attachment (G) glycoprotein. Both viruses comprise two genetic groups, A and B. Circulation frequencies of the two genetic groups fluctuate for both viruses, giving rise to frequently observed switching of the predominantly circulating group. Nucleotide sequence data for the F and G gene regions of HRSV and HMPV variants from the UK, the Netherlands and Bangkok, together with data available from GenBank, were used to identify clades of both viruses. Several contemporary circulating clades of HRSV and HMPV were identified by phylogenetic reconstructions. The molecular epidemiology and evolutionary dynamics of clades were modelled in parallel. Times of origin were determined and positively selected sites were identified. Sustained circulation of contemporary clades of both viruses for decades and their global dissemination demonstrated that switching of the predominant genetic group did not arise through the emergence of novel lineages each respiratory season, but through the fluctuating circulation frequencies of pre-existing lineages which undergo proliferative and eclipse phases. An abundance of sites were identified as positively selected within the G protein but not the F protein of both viruses. For HRSV, these were discordant with previously identified residues under selection, suggesting the virus can evade immune responses by generating diversity at multiple sites within linear epitopes. For both viruses, different sites were identified as positively selected between genetic groups.

    High resolution temporal transcriptomics of mouse embryoid body development reveals complex expression dynamics of coding and noncoding loci.

    Cellular responses to stimuli are rapid and continuous, yet the vast majority of investigations of transcriptional responses during developmental transitions use long-interval time courses, limiting the available interpretive power. Moreover, such experiments typically focus on protein-coding transcripts, ignoring the important impact of long noncoding RNAs. We therefore evaluated coding and noncoding expression dynamics at unprecedented temporal resolution (6-hourly) in differentiating mouse embryonic stem cells and report new insight into molecular processes and genome organization. We present a highly resolved differentiation cascade that exhibits coding and noncoding transcriptional alterations, transcription factor network interactions and alternative splicing events, little of which can be resolved by long-interval developmental time courses. We describe novel short-lived and cycling patterns of gene expression and dissect temporally ordered gene expression changes in response to transcription factors. We elucidate patterns in gene co-expression across the genome, describe asynchronous transcription at bidirectional promoters and functionally annotate known and novel regulatory lncRNAs. These findings highlight the complex and dynamic molecular events underlying mammalian differentiation that can only be observed through a temporally resolved time course.

    Statistical methods for high-throughput genomic data


    Specific Age-Associated DNA Methylation Changes in Human Dermal Fibroblasts

    Epigenetic modifications of cytosine residues in the DNA play a critical role in cellular differentiation and potentially also in aging. In mesenchymal stromal cells (MSC) from human bone marrow we have previously demonstrated age-associated methylation changes at specific CpG sites of developmental genes. In continuation of this work, we have now isolated human dermal fibroblasts from young (<23 years) and elderly (>60 years) donors for comparison of their DNA methylation profiles using the Infinium HumanMethylation27 assay. In contrast to MSC, fibroblasts could not be induced towards adipogenic, osteogenic and chondrogenic lineages, and this is reflected by highly significant differences between the two cell types: 766 CpG sites were hyper-methylated and 752 CpG sites were hypo-methylated in fibroblasts in comparison to MSC. Strikingly, global DNA methylation profiles of fibroblasts from the same dermal region clustered closely together, indicating that fibroblasts maintain positional memory even after in vitro culture. 75 CpG sites were more than 15% differentially methylated in fibroblasts upon aging. Very high hyper-methylation was observed in the aged group within the INK4A/ARF/INK4b locus and this was validated by pyrosequencing. Age-associated DNA methylation changes were related in fibroblasts and MSC but they were often regulated in opposite directions between the two cell types. In contrast, long-term culture associated changes were very consistent in fibroblasts and MSC. Epigenetic modifications at specific CpG sites support the notion that aging represents a coordinated developmental mechanism that seems to be regulated in a cell-type-specific manner.

    Statistical Physics and Representations in Real and Artificial Neural Networks

    This document presents the material of two lectures on statistical physics and neural representations, delivered by one of us (R.M.) at the Fundamental Problems in Statistical Physics XIV summer school in July 2017. In the first part, we consider the neural representations of space (maps) in the hippocampus. We introduce an extension of the Hopfield model, able to store multiple spatial maps as continuous, finite-dimensional attractors. The phase diagram and dynamical properties of the model are analyzed. We then show how spatial representations can be dynamically decoded using an effective Ising model capturing the correlation structure in the neural data, and compare applications to data obtained from hippocampal multi-electrode recordings and by (sub)sampling our attractor model. In the second part, we focus on the problem of learning data representations in machine learning, in particular with artificial neural networks. We start by introducing data representations through some illustrations. We then analyze two important algorithms, Principal Component Analysis and Restricted Boltzmann Machines, with tools from statistical physics.

    Identification of co-regulated candidate genes by promoter analysis.

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Statistical methods for differential proteomics at peptide and protein level
