A non-Gaussian factor analysis approach to transcription Network Component Analysis

Author: Chen Runsheng
Luo Dingsheng
Tu Shikui
Xu Lei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Transcription factor activities (TFAs), rather than expression levels, control gene expression and provide valuable information for investigating TF-gene regulations. Network Component Analysis (NCA) is a model-based method to deduce TFAs and TF-gene control strengths from microarray data and a priori TF-gene connectivity data. We modify NCA to model gene expression regulation by non-Gaussian Factor Analysis (NFA), which assumes that TFAs are drawn independently from Gaussian mixture densities. We properly incorporate a priori connectivity and/or sparsity on the mixing matrix of NFA, and derive, under the Bayesian Ying-Yang (BYY) learning framework, a BYY-NFA algorithm that not only uncovers the latent TFA profile, as NCA does, but can also automatically shut off unnecessary connections. A simulation study demonstrates the effectiveness of BYY-NFA, and a preliminary application to two real-world data sets shows that BYY-NFA improves on NCA when TF-gene connectivity is not available or not reliable, and may provide a preliminary set of candidate TF-gene interactions or double-check unreliable connections for experimental verification. © 2012 IEEE.
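The abstract builds on the basic NCA decomposition, in which a genes-by-samples expression matrix E is approximated by a product A P, where A (control strengths) is constrained to the zero pattern of the a priori connectivity and P holds the TFA profiles. A minimal sketch of that constrained factorization by alternating least squares is given here. It is not the BYY-NFA algorithm; the function name `nca_als`, the mask layout, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def nca_als(E, mask, n_iter=100, seed=0):
    """Fit E ~ A @ P with A restricted to the nonzero pattern of `mask`.

    E    : (n_genes, n_samples) expression matrix
    mask : (n_genes, n_tfs) 0/1 a priori connectivity (hypothetical layout)
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    A = rng.standard_normal((n_genes, n_tfs)) * mask
    for _ in range(n_iter):
        # Step 1: with A fixed, solve for the TFA profiles P by least squares.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: with P fixed, refit each gene using only its allowed TFs.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size == 0:
                continue
            coef, *_ = np.linalg.lstsq(P[idx].T, E[g], rcond=None)
            A[g] = 0.0
            A[g, idx] = coef
    return A, P

# Toy data: 6 genes, 2 TFs, 30 samples, with a known connectivity pattern.
rng = np.random.default_rng(1)
mask = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]])
A_true = rng.standard_normal(mask.shape) * mask
P_true = rng.standard_normal((2, 30))
E = A_true @ P_true + 0.01 * rng.standard_normal((6, 30))

A_hat, P_hat = nca_als(E, mask)
rel_err = np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E)
print("relative reconstruction error:", rel_err)
```

The factorization is only determined up to a rescaling of each TF (a column of A against the matching row of P), so the recovered values should be compared with the ground truth after normalization.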

Crossref

Sparse Linear Identifiable Multivariate Modeling

Author: Aapo Hyvärinen
Ole Winther
Ricardo Henao
Publication date: 01/01/2011

In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully Bayesian hierarchy for sparse models using slab and spike priors (two-component delta-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and benchmarked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs as well as or better than LiNGAM with comparable computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear Identifiable Multivariate modeling), and allowing for correlations between latent variables, called CSLIM (Correlated SLIM), for temporal and/or spatial data. The source code and scripts are available from http://cogsys.imm.dtu.dk/slim/. Comment: 45 pages, 17 figures.
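To make the "slab and spike" idea concrete, the sketch below draws a sparse loading matrix from the simplest two-component version of such a prior: each entry is exactly zero (the delta-function spike) with probability 1 - `inclusion_prob`, and Gaussian (the slab) otherwise. This is an illustrative prior sample only, not SLIM's hierarchy or inference; the function name, the Gaussian slab, and all parameter values are assumptions.

```python
import numpy as np

def sample_spike_and_slab(shape, inclusion_prob, slab_std, rng):
    """Draw entries that are exactly 0 (spike) or Gaussian (slab).

    Returns the sampled matrix and the boolean mask of active (slab) entries.
    """
    active = rng.random(shape) < inclusion_prob        # Bernoulli inclusion indicators
    slab = rng.normal(0.0, slab_std, size=shape)       # slab values for every entry
    return np.where(active, slab, 0.0), active         # inactive entries are set to 0

rng = np.random.default_rng(0)
W, active = sample_spike_and_slab((200, 50), inclusion_prob=0.1, slab_std=1.0, rng=rng)
print("fraction of nonzero loadings:", active.mean())
```

Posterior inference under such a prior has to decide, per entry, between the two components, which is what lets a model of this kind switch individual connections off rather than merely shrink them.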

arXiv.org e-Print Archive

CiteSeerX

Online Research Database In Technology

Robust causal structure learning with some hidden variables

Author: Frot Benjamin
Maathuis Marloes H.
Nandy Preetam
Publication date: 04/08/2018

We introduce a new method to estimate the Markov equivalence class of a directed acyclic graph (DAG) in the presence of hidden variables, in settings where the underlying DAG among the observed variables is sparse, and there are a few hidden variables that have a direct effect on many of the observed ones. Building on the so-called low rank plus sparse framework, we suggest a two-stage approach which first removes the effect of the hidden variables, and then estimates the Markov equivalence class of the underlying DAG under the assumption that there are no remaining hidden variables. This approach is consistent in certain high-dimensional regimes and performs favourably when compared to the state of the art, both in terms of graphical structure recovery and total causal effect estimation.
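One standard way to see where a "low rank plus sparse" structure comes from is the Schur complement identity for a jointly Gaussian vector. The Gaussian assumption and the notation (observed block O, hidden block H, joint precision matrix K, covariance Σ) belong to this sketch and are not taken from the abstract:

```latex
\left(\Sigma_{OO}\right)^{-1}
  \;=\; \underbrace{K_{OO}}_{S\ \text{(sparse)}}
  \;-\; \underbrace{K_{OH}\, K_{HH}^{-1}\, K_{HO}}_{L\ \text{(low rank)}},
\qquad
\operatorname{rank}(L) \le |H| .
```

Here S describes the dependencies among the observed variables given the hidden ones, and L collects the effect of the hidden variables; its rank is at most the number of hidden variables, so it is low rank when only a few hidden variables act on many observed ones. Under this reading, the first stage of a two-stage approach would separate S from L, and the second stage would work with the part that remains.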

arXiv.org e-Print Archive

Repository for Publications and Research Data

The Infinite Hierarchical Factor Regression Model

Author: Daumé III Hal
Rai Piyush
Publication date: 01/01/2008

We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors and in the relationships between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
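For readers unfamiliar with the Indian Buffet Process (IBP), the sketch below samples a binary feature-assignment matrix from the standard IBP, in which the number of columns (factors) is not fixed in advance. It illustrates only the textbook process, not the sparse variant or the coalescent hierarchy proposed in the paper; the function name and parameter values are assumptions.

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng):
    """Sample a binary matrix Z (customers x dishes) from the standard IBP."""
    dish_counts = []   # how many customers have taken each dish so far
    rows = []
    for i in range(1, n_customers + 1):
        # Take each existing dish k with probability m_k / i.
        row = [rng.random() < m / i for m in dish_counts]
        for k, taken in enumerate(row):
            if taken:
                dish_counts[k] += 1
        # Then try Poisson(alpha / i) brand-new dishes.
        n_new = int(rng.poisson(alpha / i))
        dish_counts.extend([1] * n_new)
        row.extend([True] * n_new)
        rows.append(row)
    Z = np.zeros((n_customers, len(dish_counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

rng = np.random.default_rng(0)
Z = sample_ibp(n_customers=20, alpha=2.0, rng=rng)
print("sampled matrix shape (objects x factors):", Z.shape)
```

In a factor model, a matrix like Z would indicate which factors are active for which observed variable, so the number of factors is inferred rather than chosen beforehand.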

arXiv.org e-Print Archive

CiteSeerX

Fluctuations in Gene Regulatory Networks as Gaussian Colored Noise

Author: Alon U.
Arkin A.
Coddington E.
Jinn-Wen Wu
Karen G. Petrosyan
Ming-Chang Huang
Ptashne M.
Toral R.
van Kampen N. G.
Yu-Pin Luo
Publication venue: 'AIP Publishing'
Publication date: 22/12/2009

The study of fluctuations in gene regulatory networks is extended to the case of Gaussian colored noise. First, the solution of the corresponding Langevin equation with colored noise is expressed in terms of an Ito integral. Then, two important lemmas concerning the variance of an Ito integral and the covariance of two Ito integrals are shown. Based on the lemmas, we give explicit general formulae for the variances and covariance of molecular concentrations for a regulatory network near a stable equilibrium. Two examples, the gene auto-regulatory network and the toggle switch, are presented in detail. In general, it is found that the finite correlation time of the noise reduces the fluctuations and enhances the correlation between the fluctuations of the molecular components. Comment: 10 pages, 4 figures.
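The qualitative claim that a finite noise correlation time reduces fluctuations can be checked on the simplest possible case, a single linearized variable driven by exponentially correlated (Ornstein-Uhlenbeck) noise. The one-variable model and the symbols k, D and τ are assumptions of this worked example, not the paper's general formulae:

```latex
\dot{x}(t) = -k\,x(t) + \eta(t), \qquad
\langle \eta(t)\,\eta(t') \rangle = \frac{D}{\tau}\, e^{-|t-t'|/\tau}
\quad\Longrightarrow\quad
\langle x^{2} \rangle_{\mathrm{st}} = \frac{D}{k\,(1 + k\tau)} .
```

In the limit τ → 0 the noise becomes white, with ⟨η(t)η(t')⟩ = 2D δ(t − t'), and the stationary variance is D/k. For any τ > 0 the variance is smaller by the factor 1/(1 + kτ), in line with the trend stated in the abstract.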

arXiv.org e-Print Archive

Crossref

Information capacity of genetic regulatory elements

Author: A. P. Gasch
C. E. Shannon
Curtis G. Callan
Gašper Tkačik
H. B. Barlow
J. D. Watson
N. G. van Kampen
P. A. Lawrence
S. B. Laughlin
T. M. Cover
William Bialek
Publication venue: 'American Physical Society (APS)'
Publication date: 26/09/2007

Changes in a cell's external or internal conditions are usually reflected in the concentrations of the relevant transcription factors. These proteins in turn modulate the expression levels of the genes under their control and sometimes need to perform non-trivial computations that integrate several inputs and affect multiple genes. At the same time, the activities of the regulated genes would fluctuate even if the inputs were held fixed, as a consequence of the intrinsic noise in the system, and such noise must fundamentally limit the reliability of any genetic computation. Here we use information theory to formalize the notion of information transmission in simple genetic regulatory elements in the presence of physically realistic noise sources. The dependence of this "channel capacity" on noise parameters, cooperativity and cost of making signaling molecules is explored systematically. We find that, at least in principle, capacities higher than one bit should be achievable and that consequently genetic regulation is not limited to the use of binary, or "on-off", components. Comment: 17 pages, 9 figures.
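As a generic illustration of what "channel capacity" means numerically, the sketch below computes the capacity of a small discrete channel with the Blahut-Arimoto iteration. It is a textbook algorithm applied to toy channels, not the paper's noise model or analysis; the function names and the two example channels are assumptions.

```python
import numpy as np

def _kl_rows(P, q):
    """KL divergence (in nats) between each row of P and the output marginal q."""
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(P > 0, np.log(P / q), 0.0)
    return (P * log_ratio).sum(axis=1)

def blahut_arimoto(P, tol=1e-12, max_iter=10_000):
    """Capacity (bits) of a discrete channel with P[x, y] = p(y | x)."""
    P = np.asarray(P, dtype=float)
    p = np.full(P.shape[0], 1.0 / P.shape[0])   # start from a uniform input distribution
    for _ in range(max_iter):
        d = _kl_rows(P, p @ P)
        new_p = p * np.exp(d)                   # reweight inputs by how informative they are
        new_p /= new_p.sum()
        converged = np.abs(new_p - p).max() < tol
        p = new_p
        if converged:
            break
    capacity_bits = float(p @ _kl_rows(P, p @ P)) / np.log(2)
    return capacity_bits, p

# Binary input with 10% readout errors: less than one bit.
c_binary, _ = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
# Four perfectly distinguishable input levels: two bits.
c_four, _ = blahut_arimoto(np.eye(4))
print(c_binary, c_four)
```

The noisy binary channel yields about 0.531 bits, while four noise-free levels yield 2 bits, which shows how a regulatory element with more than two distinguishable output levels could in principle exceed one bit.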

arXiv.org e-Print Archive

Crossref

Identifying stochastic oscillations in single-cell live imaging time series using Gaussian processes

Author: Manning Cerys
Papalopulu Nancy
Phillips Nick E.
Rattray Magnus
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 25/05/2017

Multiple biological processes are driven by oscillatory gene expression at different time scales. Pulsatile dynamics are thought to be widespread, and single-cell live imaging of gene expression has led to a surge of dynamic, possibly oscillatory, data for different gene networks. However, the regulation of gene expression at the level of an individual cell involves reactions between finite numbers of molecules, and this can result in inherent randomness in expression dynamics, which blurs the boundaries between aperiodic fluctuations and noisy oscillators. Thus, there is an acute need for an objective statistical method for classifying whether an experimentally derived noisy time series is periodic. Here we present a new data analysis method that combines mechanistic stochastic modelling with the powerful methods of non-parametric regression with Gaussian processes. Our method can distinguish oscillatory gene expression from random fluctuations of non-oscillatory expression in single-cell time series, despite peak-to-peak variability in period and amplitude of single-cell oscillations. We show that our method outperforms the Lomb-Scargle periodogram in successfully classifying cells as oscillatory or non-oscillatory in data simulated from a simple genetic oscillator model and in experimental data. Analysis of bioluminescent live cell imaging shows a significantly greater number of oscillatory cells when luciferase is driven by a Hes1 promoter (10/19), which has previously been reported to oscillate, than by the constitutive MoMuLV 5' LTR (MMLV) promoter (0/25). The method can be applied to data from any gene network both to quantify the proportion of oscillating cells within a population and to measure the period and quality of oscillations. It is publicly available as a MATLAB package. Comment: 36 pages, 17 figures.
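The general idea of comparing an oscillatory with a non-oscillatory Gaussian process model can be sketched in a few lines: score one time series under two covariance functions and take the difference of the log marginal likelihoods. The covariance functions `ou_kernel` and `ou_osc_kernel`, the fixed hyperparameters, and the synthetic data are assumptions of this sketch; the published MATLAB package, its hyperparameter fitting, and its classification procedure are not reproduced here.

```python
import numpy as np

def ou_kernel(lag, var, alpha):
    """Aperiodic covariance: exponentially decaying correlation."""
    return var * np.exp(-alpha * np.abs(lag))

def ou_osc_kernel(lag, var, alpha, beta):
    """Oscillatory covariance: decaying correlation modulated by a cosine."""
    return var * np.exp(-alpha * np.abs(lag)) * np.cos(beta * lag)

def gp_log_likelihood(t, y, cov, noise_var):
    """Log marginal likelihood of y under a zero-mean GP with covariance `cov`."""
    K = cov(t[:, None] - t[None, :]) + noise_var * np.eye(len(t))
    L = np.linalg.cholesky(K)
    w = np.linalg.solve(L, y)                      # w @ w equals y^T K^{-1} y
    return -0.5 * w @ w - np.log(np.diag(L)).sum() - 0.5 * len(t) * np.log(2 * np.pi)

# Synthetic "cell": a noisy sinusoid with period 2.5 (arbitrary time units).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 51)
period = 2.5
y = np.sin(2 * np.pi * t / period) + 0.1 * rng.standard_normal(t.size)

ll_ou = gp_log_likelihood(t, y, lambda d: ou_kernel(d, 0.5, 0.2), noise_var=0.01)
ll_osc = gp_log_likelihood(
    t, y, lambda d: ou_osc_kernel(d, 0.5, 0.2, 2 * np.pi / period), noise_var=0.01
)
llr = ll_osc - ll_ou   # positive values favour the oscillatory model
print("log-likelihood ratio:", llr)
```

In practice the hyperparameters of both models would be fitted to each time series, and the resulting ratio would need a reference distribution from non-oscillatory data before a cell could be called oscillatory.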

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare