18,727 research outputs found

    Using temporal correlation in factor analysis for reconstructing transcription factor activities

    Two-level gene regulatory networks consist of the transcription factors (TFs) in the top level and their regulated genes in the second level. The expression profiles of the regulated genes are the observed high-throughput data given by experiments such as microarrays. The activity profiles of the TFs are treated as hidden variables, as is the connectivity matrix that indicates the regulatory relationships of TFs with their regulated genes. Factor analysis (FA), as well as other methods such as the network component algorithm, has been suggested for reconstructing gene regulatory networks and for predicting TF activities. These methods have been applied to E. coli and yeast data under the assumption that the datasets consist of independent and identically distributed samples. Thus, their main drawback is that they ignore any time correlation existing within the TF profiles. In this paper, we extend previously studied FA algorithms to include time correlation within the transcription factors. At the same time, we consider sparse connectivity matrices in order to capture the sparsity present in gene regulatory networks. The TF activity profiles obtained by this approach are significantly smoother than profiles from previous FA algorithms. The periodicities in profiles from yeast expression data become prominent in our reconstruction. Moreover, the strength of the correlation between time points is estimated and can be used to assess the suitability of the experimental time interval.

    Boolean Dynamics with Random Couplings

    This paper reviews a class of generic dissipative dynamical systems called N-K models. In these models, the dynamics of N elements, defined as Boolean variables, develop step by step, clocked by a discrete time variable. Each of the N Boolean elements at a given time is given a value which depends upon K elements in the previous time step. We review the work of many authors on the behavior of the models, looking particularly at the structure and lengths of their cycles, the sizes of their basins of attraction, and the flow of information through the systems. In the limit of infinite N, there is a phase transition between a chaotic and an ordered phase, with a critical phase in between. We argue that the behavior of this system depends significantly on the topology of the network connections. If the elements are placed upon a lattice with dimension d, the system shows correlations related to the standard percolation or directed percolation phase transition on such a lattice. On the other hand, a very different behavior is seen in the Kauffman net, in which all spins are equally likely to be coupled to a given spin. In this situation, coupling loops are mostly suppressed, and the behavior of the system is much more like that of a mean field theory. We also describe possible applications of the models to, for example, genetic networks, cell differentiation, evolution, democracy in social systems and neural networks. Comment: 69 pages, 16 figures, Submitted to Springer Applied Mathematical Sciences Series
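The update rule described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`make_nk_network`, `step`, `find_cycle`) are hypothetical, and the choice to draw K distinct inputs uniformly at random with uniformly random truth tables is an assumption meant to mimic the Kauffman net.

```python
import random

def make_nk_network(n, k, seed=0):
    """Build a random N-K network (assumption: K distinct inputs per element,
    chosen uniformly, and a uniformly random Boolean truth table per element)."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Advance all N elements by one synchronous time step."""
    new_state = []
    for inp, table in zip(inputs, tables):
        index = 0
        for j in inp:
            # Pack the K input bits into an index into the truth table.
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return tuple(new_state)

def find_cycle(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    transient = seen[state]
    return transient, t - transient

if __name__ == "__main__":
    inputs, tables = make_nk_network(n=10, k=2, seed=1)
    start = tuple([0] * 10)
    print(find_cycle(start, inputs, tables))
```

Because the state space is finite (2^N states) and the dynamics are deterministic, every trajectory must eventually enter a cycle, which is why `find_cycle` always terminates.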

    Revisiting the Training of Logic Models of Protein Signaling Networks with a Formal Approach based on Answer Set Programming

    A fundamental question in systems biology is the construction of mathematical models and their training to data. Logic formalisms have become very popular for modeling signaling networks because their simplicity allows us to model large systems encompassing hundreds of proteins. An approach to train (Boolean) logic models to high-throughput phospho-proteomics data was recently introduced and solved using optimization heuristics based on stochastic methods. Here we demonstrate how this problem can be solved using Answer Set Programming (ASP), a declarative problem solving paradigm in which a problem is encoded as a logical program such that its answer sets represent solutions to the problem. ASP offers significant improvements over heuristic methods in terms of efficiency and scalability: it guarantees global optimality of solutions and provides a complete set of solutions. We illustrate the application of ASP with in silico cases based on realistic networks and data.

    Predicting protein functions with message passing algorithms

    Motivation: In the last few years, a growing interest in biology has been shifting towards the problem of optimal information extraction from the huge amount of data generated via large-scale and high-throughput techniques. One of the most relevant issues has recently become that of correctly and reliably predicting the functions of observed but still functionally undetermined proteins, starting from information coming from the network of co-observed proteins of known functions. Method: The method proposed in this article is based on a message passing algorithm known as Belief Propagation, which takes as input the network of protein physical interactions and a catalog of known protein functions, and returns, for each unclassified protein, the probability of having one chosen function. The implementation of the algorithm allows for fast on-line analysis, and can be easily generalized to more complex graph topologies taking into account hyper-graphs, i.e., complexes of more than two interacting proteins. Comment: 12 pages, 9 eps figures, 1 additional html table

    Hearing the clusters in a graph: A distributed algorithm

    We propose a novel distributed algorithm to cluster graphs. The algorithm recovers the solution obtained from spectral clustering without the need for expensive eigenvalue/eigenvector computations. We prove that, by propagating waves through the graph, a local fast Fourier transform yields the local component of every eigenvector of the Laplacian matrix, thus providing clustering information. For large graphs, the proposed algorithm is orders of magnitude faster than random-walk-based approaches. We prove the equivalence of the proposed algorithm to spectral clustering and derive convergence rates. We demonstrate the benefit of using this decentralized clustering algorithm for community detection in social graphs, accelerating distributed estimation in sensor networks, and efficient computation of distributed multi-agent search strategies.

    A path following algorithm for the graph matching problem

    We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-squares problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of doubly stochastic matrices. The concave relaxation has the same global minimum as the initial graph matching problem, but the search for its global minimum is also a hard combinatorial problem. We therefore construct an approximation of the concave problem solution by following a solution path of a convex-concave problem obtained by linear interpolation of the convex and concave formulations, starting from the convex relaxation. This method makes it easy to integrate the information on graph label similarities into the optimization problem, and therefore to perform labeled weighted graph matching. The algorithm is compared with some of the best performing graph matching methods on four datasets: simulated graphs, QAPLib, retina vessel images and handwritten Chinese characters. In all cases, the results are competitive with the state of the art. Comment: 23 pages, 13 figures, typo correction, new results in sections 4,5,
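To make the least-squares formulation over permutations concrete, here is a minimal sketch of the objective, minimized by exhaustive search on a tiny example. This is not the path-following algorithm of the abstract: it is the brute-force baseline whose factorial cost motivates relaxations. The function names and the exact form of the cost (sum of squared differences between `A[i][j]` and `B[p[i]][p[j]]`) are assumptions for illustration.

```python
from itertools import permutations

def matching_cost(A, B, perm):
    """Least-squares discrepancy between graph A and graph B relabeled by perm.

    perm[i] is the vertex of B matched to vertex i of A (illustrative convention).
    """
    n = len(A)
    return sum(
        (A[i][j] - B[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(n)
    )

def brute_force_match(A, B):
    """Try every permutation; feasible only for very small graphs (n! candidates)."""
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(A))):
        cost = matching_cost(A, B, perm)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

if __name__ == "__main__":
    # Two 3-vertex paths: the middle vertex is 1 in A and 2 in B.
    A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    B = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
    print(brute_force_match(A, B))
```

A cost of zero means the two weighted adjacency matrices agree exactly under the chosen relabeling.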

    Community detection in networks via nonlinear modularity eigenvectors

    Revealing a community structure in a network or dataset is a central problem arising in many scientific areas. The modularity function $Q$ is an established measure quantifying the quality of a community, a community being identified as a set of nodes having high modularity. In our terminology, a set of nodes with positive modularity is called a module, and a set that maximizes $Q$ is thus called a leading module. Finding a leading module in a network is an important task; however, the dimension of real-world problems makes the maximization of $Q$ infeasible. This creates the need for approximation techniques, which are typically based on a linear relaxation of $Q$ induced by the spectrum of the modularity matrix $M$. In this work we propose a nonlinear relaxation which is instead based on the spectrum of a nonlinear modularity operator $\mathcal{M}$. We show that extremal eigenvalues of $\mathcal{M}$ provide an exact relaxation of the modularity measure $Q$, at the price, however, of being more challenging to compute than those of $M$. Thus we extend the work done on nonlinear Laplacians by proposing a computational scheme, named generalized RatioDCA, to address such extremal eigenvalues. We show monotonic ascent and convergence of the method. We finally apply the new method to several synthetic and real-world data sets, showing both the effectiveness of the model and the performance of the method.
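The modularity of a node set can be made concrete with a short sketch. It assumes the common unnormalized Newman-Girvan form, Q(S) = sum over i, j in S of (A_ij - d_i d_j / vol), where d_i is the degree of node i and vol is the sum of all degrees; the abstract does not spell out the normalization, so this form and the function names are assumptions. The exhaustive search over subsets is included only to show why exact maximization scales as 2^n.

```python
from itertools import combinations

def set_modularity(adj, nodes):
    """Unnormalized modularity of a node set (assumed Newman-Girvan form)."""
    degrees = [sum(row) for row in adj]
    volume = sum(degrees)
    nodes = list(nodes)
    return sum(
        adj[i][j] - degrees[i] * degrees[j] / volume
        for i in nodes
        for j in nodes
    )

def leading_module_brute_force(adj):
    """Scan all non-empty subsets (2^n - 1 of them) for the largest modularity."""
    n = len(adj)
    best_set, best_q = None, float("-inf")
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            q = set_modularity(adj, subset)
            if q > best_q:
                best_set, best_q = set(subset), q
    return best_set, best_q

def two_triangles():
    """Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = [[0] * 6 for _ in range(6)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj

if __name__ == "__main__":
    adj = two_triangles()
    print(set_modularity(adj, {0, 1, 2}))   # one triangle: positive, so a module
    print(set_modularity(adj, {2, 3}))      # the bridge endpoints: negative
    print(leading_module_brute_force(adj))
```

On this toy graph each triangle has positive modularity and therefore counts as a module in the abstract's terminology, while the pair of bridge endpoints does not.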