Computational Approaches To Improving The Reconstruction Of Metabolic Pathway

Abstract

Metabolic pathway reconstruction is the essence of systems biology, where in silico modeling and prediction of the cell's function are based on the interactions of the cell's components, represented as a network of reactions. The reconstructed model and the associated database of information about the organism's genes and their functional roles facilitate a variety of analysis and simulation techniques that can enrich our understanding. However, there are unresolved issues for genome-scale metabolic network reconstruction, such as our incomplete knowledge of the cell's networks for metabolism, transport, and regulation; the completeness, accuracy, and specificity of the annotation of genomes; and our ability to fully utilise the available information from -omics data (genomics, proteomics, metabolomics, etc.) for the reconstruction of the networks. These issues result in incomplete metabolic models, which limit our ability to perform analyses of, and make predictions about, the cell based on the network model. This dissertation discusses the state of the art in metabolic pathway reconstruction and highlights the outstanding issues. In particular, we consider a number of case studies using genomes of fungi relevant to industrial applications, such as biofuels, to demonstrate the performance of existing techniques and illustrate the issues. Our case studies focus on the cell's central metabolism, and the utilisation and transport of sugars as a carbon source, since these are essential concerns for industrial applications. A significant deficiency in the existing state of the art for the reconstruction of metabolic pathways is the inability to associate genes and proteins to the transport reactions that move specific compounds across the membranes of the cell.
The dissertation reviews the state of the art in prediction methods for transmembrane transport proteins by developing a scheme to describe and compare existing methods, and by applying the existing techniques to the fungal genome of A. niger CBS 513.88. This reveals the split between those methods that use the Transporter Classification (TC) as their target for prediction, and those that use the type of chemical substrates being transported as their target. Despite this difficulty in comparing approaches, it is clear that the state of the art cannot predict the specific substrates being transported, and hence cannot associate genes and proteins to the transport reactions. The dissertation presents TransATH, which stands for Transporters via ATH (Annotation Transfer by Homology), a system that automates Saier's protocol, includes the computation of subcellular localization, and improves the computation of transmembrane segments. The choice of thresholds for the parameters of TransATH is investigated to determine optimal performance as defined by a gold standard set of transporters and non-transporters from S. cerevisiae. The dissertation demonstrates TransATH on the fungal genome of A. niger CBS 513.88 and evaluates the correctness of TransATH using the curated information in AspGD (the Aspergillus Database). A website for TransATH is available for use.
