
    Machine learning methods for detecting structure in metabolic flow networks

    Metabolic flow networks are large-scale, mechanistic biological models with good predictive power. However, even when they provide good predictions, interpreting the meaning of their structure can be very difficult, especially for large networks which model entire organisms. This is an underaddressed problem in general, and the analytic techniques that currently exist are difficult to combine with experimental data. The central hypothesis of this thesis is that statistical analysis of large datasets of simulated metabolic fluxes is an effective way to gain insight into the structure of metabolic networks. These datasets can be either simulated or experimental, allowing insight into real-world data while retaining the large sample sizes only easily possible via simulation. This work demonstrates that this approach can yield results in detecting structure both in a population of solutions and in the network itself. It begins with a taxonomy of sampling methods over metabolic networks, before introducing three case studies of different sampling strategies. Two of these case studies represent, to my knowledge, the largest datasets of their kind, at around half a million points each. Generating them in a reasonable time frame required custom software, and datasets of this size are necessary because of the high dimensionality of the sample space. Next, a number of techniques are described which operate on smaller datasets. These techniques, focused on pairwise comparison, show what can be achieved with smaller datasets, and how visualisation techniques that have no simple analogue for larger datasets become applicable in these cases. In the next chapter, Similarity Network Fusion is used for the first time to cluster organisms across several levels of biological organisation, resulting in the detection of discrete, quantised biological states in the underlying datasets. This quantisation effect was maintained across both real biological data and Monte Carlo simulated data, with related underlying biological correlates, implying that this behaviour stems from the network structure itself rather than from the genetic or regulatory mechanisms that would normally be assumed. Finally, Hierarchical Block Matrices are used as a model of multi-level network structure, by clustering reactions using a variety of distance metrics: first standard network distance measures, then Local Network Learning, a novel approach that measures connection strength via the gain in predictive power of each node on its neighbourhood. The clusters uncovered using this approach are validated against pre-existing subsystem labels and found to outperform alternative techniques. Overall, this thesis represents a significant new approach to metabolic network structure detection, as both a theoretical framework and a set of technological tools, which can readily be expanded to cover other classes of multilayer network, an underexplored datatype across a wide variety of contexts. In addition to the new techniques for metabolic network structure detection introduced, this research has proved fruitful both in its use in applied biological research and in terms of the software developed, which is experiencing substantial usage.

    Elucidating Flux Regulation of the Fermentation Modes of Lactococcus lactis: A Multilevel Study

    Geometry of nonequilibrium reaction networks

    Building on Kirchhoff's treatment of electrical circuits, Hill and Schnakenberg, among others, proposed a celebrated theory for the thermodynamics of Markov processes and linear biochemical networks that exploited tools from graph theory to build fundamental nonequilibrium observables. However, such a simple geometrical interpretation does not carry through for arbitrary chemical reaction networks because reactions can be many-to-many and are thus represented by a hypergraph rather than a graph. Here we generalize some of the geometric intuitions behind the Hill-Schnakenberg approach to arbitrary reaction networks. In particular, we give simple procedures to build bases of cycles (encoding stationary nonequilibrium behavior) and cocycles (encoding relaxation), to interpret them in terms of circulations and gradients, and to use them to properly project nonequilibrium observables onto the relevant subspaces. We develop the theory for chemical reaction networks endowed with mass-action kinetics and enrich the description with insights from the corresponding stochastic models. Finally, based on the linear-regime assumption, we deploy the formalism to propose a reconstruction algorithm for metabolic networks consistent with Kirchhoff's Voltage and Current Laws. Comment: 36 pages, 10 figures
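    The notion of a cycle basis mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it only uses the standard fact that steady-state fluxes v satisfy S v = 0 (Kirchhoff's Current Law) for a stoichiometric matrix S, so a basis of the right null space of S is a basis of cycles. The toy network (A -> B, B -> C, C -> A, A -> C) and the helper name `null_space_basis` are assumptions made for illustration.

```python
from fractions import Fraction


def null_space_basis(matrix):
    """Basis of {v : matrix @ v = 0}, via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        # Look for a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # column c is a free column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][c]
        rows[r] = [x / scale for x in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break
    # One basis vector per free column: set it to 1 and solve for the pivots.
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    basis = []
    for f in free_cols:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for row_idx, p in enumerate(pivot_cols):
            v[p] = -rows[row_idx][f]
        basis.append(v)
    return basis


# Hypothetical toy network. Rows: species A, B, C.
# Columns: R1 (A -> B), R2 (B -> C), R3 (C -> A), R4 (A -> C).
S = [
    [-1, 0, 1, -1],
    [1, -1, 0, 0],
    [0, 1, -1, 1],
]

cycles = null_space_basis(S)
for v in cycles:
    print([str(x) for x in v])
```

    Each printed vector assigns a flux to every reaction such that no species accumulates. Exact rational arithmetic is used here so that the check S v = 0 holds without floating-point tolerance; the many-to-many (hypergraph) case treated in the paper uses the same matrix condition with different S.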

    The Convex Hull Problem in Practice: Improving the Running Time of the Double Description Method

    The double description method is a simple but widely used algorithm for computing the extreme points of polyhedral sets. One key aspect of its implementation is the question of how to efficiently test extreme points for adjacency. In this dissertation, two significant contributions related to adjacency testing are presented. First, the currently used data structures are revisited and various optimizations are proposed. Empirical evidence is provided to demonstrate their competitiveness. Second, a new adjacency test is introduced. It is a refinement of the well-known algebraic test featuring a technique for avoiding redundant computations. Its correctness is formally proven. Its superiority in multiple degenerate scenarios is demonstrated through experimental results. Parallel computation is one further aspect of the double description method covered in this work. A recently introduced divide-and-conquer technique is revisited and considerable practical limitations are demonstrated.
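    For orientation, the well-known algebraic adjacency test that this abstract refers to can be sketched as follows. This is the textbook form for a pointed cone {x : A x >= 0} in dimension d, where two extreme rays are adjacent exactly when the constraint rows active at both rays have rank d - 2. It is not the refined test proposed in the dissertation. The function names and the example cone (the cone over a square) are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a list-of-rows matrix, via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rk = 0
    for c in range(n_cols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][c] / rows[rk][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def active_rows(A, ray):
    """Indices of constraints a.x >= 0 that hold with equality at the ray."""
    return {i for i, row in enumerate(A)
            if sum(a * x for a, x in zip(row, ray)) == 0}


def are_adjacent(A, ray1, ray2):
    """Algebraic test: rank of the commonly active rows equals d - 2."""
    d = len(ray1)
    common = sorted(active_rows(A, ray1) & active_rows(A, ray2))
    return rank([A[i] for i in common]) == d - 2


# Hypothetical example: the cone over a square, {x : |x1| <= x3, |x2| <= x3}.
A = [
    [-1, 0, 1],   # x3 - x1 >= 0
    [1, 0, 1],    # x3 + x1 >= 0
    [0, -1, 1],   # x3 - x2 >= 0
    [0, 1, 1],    # x3 + x2 >= 0
]
corner = (1, 1, 1)
neighbour = (1, -1, 1)    # shares the facet x1 = x3 with `corner`
opposite = (-1, -1, 1)    # diagonally opposite, shares no facet

print(are_adjacent(A, corner, neighbour))
print(are_adjacent(A, corner, opposite))
```

    Each call computes a matrix rank, and the double description method performs such a test for many pairs of rays at each iteration, which is why the abstract treats adjacency testing as the performance-critical step.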