Development of a Complementary Elementary Mode Analysis Method for Integrating Diverse Biological Data into Complex Metabolic Networks

Badsha, Md. Bahadur

Authors: Md. Bahadur Badsha
Publication date: 1 March 2015
Publisher: Kyushu Institute of Technology (九州工業大学)

Abstract

Doctoral dissertation, Kyushu Institute of Technology. Degree certificate number: 情工博甲第299号. Date of degree conferral: 25 March 2015 (Heisei 27).

Contents: 1 Introduction; 2 Background; 3 Materials and Methods; 4 Results and Discussions; 5 Conclusion, Scope and Future Research Interest

Systems biotechnology aims to develop comprehensive and ultimately predictive models of how the components of a biological system reproduce its observed behavior. Major human diseases such as diabetes, obesity, high blood pressure, cardiovascular disease, and cancer involve failures of human metabolic systems. Metabolism is therefore an important biological process, but metabolic systems are complex and highly interconnected. Metabolic network maps represent complex chains of chemical reactions that are closely associated with genes, proteins, and enzymes; consequently, mathematical and computational approaches are needed to integrate them. Heterogeneous biological data, including genome, transcriptome, proteome, and metabolome data, are integrated into a pathway-based metabolic model to predict the flux distribution of genetically modified cells under particular conditions. Integrating heterogeneous biological data and building models have become essential activities in biological research as technological advances make it possible to measure biological data of increasing diversity and scale. The challenge is how to integrate these data so as to maximize the amount of useful biological information that can be extracted. Metabolic pathway analysis is theoretically effective for integrating heterogeneous biological data into metabolic networks and offers great opportunities for studying the functional and structural properties of metabolic pathways. Metabolic pathway analysis has focused on two approaches: elementary modes (EMs) and extreme pathways (Expas).
EM analysis is potentially effective for integrating transcriptome or proteome data into metabolic network analyses. An EM is a minimal set of reactions that can maintain a steady state, while the Expas are a subset of the EMs that satisfy two additional conditions, one of which makes all Expas systemically independent. The EM coefficients (EMCs) indicate the quantitative contributions of their associated EMs and can be estimated by maximizing a particular objective function. A serious problem of EM/Expa analysis is that the computational time increases exponentially with network size, which makes computing all EMs/Expas expensive and impracticable for large- or genome-scale networks. Another major problem is that for many organisms no specific biological objective function is available for estimating the EMCs so as to predict the flux distribution associated with the optimal physiological state. In addition, each EM can be represented by any scalar multiple of its vector, but the predicted flux distributions must be independent of that choice. To address these problems, this thesis presents a fast and efficient algorithm, called complementary EM (cEM) analysis, that reduces the number of EMs/Expas. To improve computational time, we employ the EM decomposition method, which explores the major EMs, or linear combinations of them, that are responsible for the metabolic flux distributions. Flux balance analysis (FBA) is used to generate a wide range of possible metabolic flux distributions as the input data required by the EM decomposition method. The maximum entropy principle (MEP) is used as the objective function for estimating the coefficients of the cEMs, which avoids the scalar product problem of EMs.
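
To make the roles of EMs, EMCs, and the MEP concrete, the following is a minimal sketch on a hypothetical three-metabolite toy network. It is not the thesis's cEM algorithm. The network, the function names, and the formulation (maximizing the Shannon entropy of EMCs normalized to substrate uptake, subject to reproducing the measured exchange fluxes) are assumptions made for illustration. A brute-force grid search over the single undetermined coefficient stands in for a proper constrained optimizer.

```python
import math

# Hypothetical toy network (not from the thesis).
# Internal metabolites: A, B, C. All six reactions are irreversible.
#   v1: -> A    v2: A -> B    v3: A -> C
#   v4: B ->    v5: C ->      v6: B -> C
S = [
    [1, -1, -1,  0,  0,  0],  # balance of A
    [0,  1,  0, -1,  0, -1],  # balance of B
    [0,  0,  1,  0, -1,  1],  # balance of C
]

# The three elementary modes of this toy network, each scaled to v1 = 1.
EMS = [
    [1, 1, 0, 1, 0, 0],  # EM 1: A -> B -> export
    [1, 0, 1, 0, 1, 0],  # EM 2: A -> C -> export
    [1, 1, 0, 0, 1, 1],  # EM 3: A -> B -> C -> export
]


def is_steady_state(flux, tol=1e-9):
    """Return True if S * flux = 0 within tol."""
    return all(abs(sum(s * v for s, v in zip(row, flux))) <= tol for row in S)


def entropy(rho):
    """Shannon entropy of the coefficients, with 0 * log(0) taken as 0."""
    return -sum(p * math.log(p) for p in rho if p > 0)


def mep_coefficients(uptake, export_b, export_c, steps=10000):
    """Pick the maximum-entropy EMCs that reproduce the measured exchange fluxes.

    Only EM 1 exports B, so rho1 is fixed by the data. EMs 2 and 3 both
    export C, so their split is undetermined; a grid search over that single
    free parameter selects the split with the highest entropy.
    """
    rho1 = export_b / uptake
    c_share = export_c / uptake
    best_rho, best_h = None, -math.inf
    for k in range(steps + 1):
        rho3 = c_share * k / steps
        rho = [rho1, c_share - rho3, rho3]
        h = entropy(rho)
        if h > best_h:
            best_rho, best_h = rho, h
    return best_rho


def flux_from_coefficients(rho, uptake):
    """Reconstruct the full flux vector as uptake * sum_i rho_i * EM_i."""
    n_reactions = len(EMS[0])
    return [uptake * sum(r * em[j] for r, em in zip(rho, EMS))
            for j in range(n_reactions)]


# Example: uptake of 10, with 4 units exported as B and 6 as C.
rho = mep_coefficients(uptake=10.0, export_b=4.0, export_c=6.0)
flux = flux_from_coefficients(rho, uptake=10.0)
print([round(r, 3) for r in rho])
print([round(v, 3) for v in flux], is_steady_state(flux, tol=1e-6))
```

In this example the exchange fluxes alone do not determine how the export of C is divided between EM 2 and EM 3, and the entropy criterion settles on an even split, giving coefficients close to 0.4, 0.3, and 0.3. The inputs are assumed to be mass balanced (the two exports sum to the uptake). For a real network the EMs would come from a dedicated enumeration tool and the coefficients from a numerical optimizer.
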
MEP is widely used for flux prediction in cases where no biological objective function is available, and its main advantage is that it does not depend on the scalar product of each EM. To demonstrate the feasibility of cEM analysis, we compared it with EM/Expa analysis in a simulation study with an artificial metabolic network model and in analyses of real metabolic networks: two medium-scale metabolic network models of E. coli and a genome-scale model of head and neck cancer cells. cEM analysis greatly reduces the number of EMs, the computational time, and the memory cost for the genome-scale metabolic network. Applying cEM analysis to Genetic Modification of Flux (GMF) accurately predicts the flux distributions of genetic mutants under particular conditions. cEM analysis can also be used to plan a genetic engineering strategy, based on a genome-scale metabolic network model, for producing useful compounds.

Keywords: Systems biotechnology; Integrating biological data; Constraint-based metabolic modeling; Large-scale metabolic network; Elementary mode decomposition; Complementary elementary mode analysis; Quantitative contributions; Prediction speed and accuracy