
    Reconstruction of an in silico metabolic model of _Arabidopsis thaliana_ through database integration

    The number of genome-scale metabolic models has been rising quickly in recent years, and their applications range from metabolic engineering to biological discovery. However, the reconstruction of such models remains an arduous process requiring a high level of human intervention. Their use is further hampered by the absence of standardized data and annotation formats and the lack of recognized quality and validation standards.

Plants provide a particularly rich range of perspectives for applications of metabolic modeling. We report here the first effort toward reconstructing a genome-scale model of the metabolic network of the plant _Arabidopsis thaliana_, including over 2300 reactions and compounds. Our reconstruction was performed using a semi-automatic methodology based on the integration of two public genome-wide databases, significantly accelerating the process. Database entries were compared and integrated with each other, allowing us to resolve discrepancies and enhance the quality of the reconstruction. This process led to the construction of three models based on different quality and validation standards, allowing users to choose the standard that is most appropriate for a given application. First, a _core metabolic model_ containing only consistent data provides a high-quality model that was shown to be stoichiometrically consistent. Second, an _intermediate metabolic model_ attempts to fill gaps and provides better continuity. Third, a _complete metabolic model_ contains the full set of known metabolic reactions and compounds in _Arabidopsis thaliana_.

We provide an annotated SBML file of our core model to enable the maximum level of compatibility with existing tools and databases. Finally, we discuss a series of principles to raise awareness of the need for coordinated efforts and common standards in the reconstruction of genome-scale metabolic models, with the aim of enabling their widespread diffusion, frequent update, maximum compatibility and convenience of use by the wider research community and industry.

    A Novel Bioinformatic Approach to Understanding Addiction

    Finding the genetic markers that influence complex, multigenic substance addiction phenotypes has been an area of significant medical study. Understanding complex disease traits like addiction has been hampered by the lack of functional insights into novel variants in the human genome. We hypothesized that gene location plays a role in functional genomic neighborhoods. To test whether there is a relationship between opiate, dopamine, and GABA disease and population allele frequencies, we used genes obtained from addiction literature curated by the National Center for Biotechnology Information (NCBI). These addiction- and metabolism-focused search terms generated opiate, dopamine, and GABA addiction results (N=587 genes). These genes were then projected onto the genome to identify cluster regions of genetic importance for substance addiction. Clusters were defined as regions of the genome with more than six genes within a 1.5Mb linear genomic window. We identified seven hotspots located on chromosomes 4, 6 (2 clusters), 10, 11, and 19. Human polymorphism data were surveyed from the 1148 individuals comprising the 11 sample populations of the HapMap Project dataset. Our analyses of these populations identified ten candidate addiction alleles. Finally, assessments of public genome-wide association studies show long-range linkages to canonical addiction genes. This study delineates a novel method to identify candidate addiction variants using a systems biology approach that relies on an interdisciplinary set of data, including genomic data, pathway data, and population variation. Important connections to sociological and environmental data are discussed to contextualize addiction data.
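
The cluster definition in this abstract (more than six genes within a 1.5Mb linear window) can be illustrated with a short sliding-window sketch. This is not the authors' code: the function name `find_clusters`, the `(chromosome, position, name)` input format, and the choice to merge overlapping qualifying windows into a single cluster are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_BP = 1_500_000  # 1.5 Mb window, as stated in the abstract
MIN_GENES = 7          # "more than six genes"


def find_clusters(genes, window_bp=WINDOW_BP, min_genes=MIN_GENES):
    """Return (chromosome, start_bp, end_bp, gene_names) for each cluster.

    `genes` is an iterable of (chromosome, position_bp, name) tuples.
    Assumption: overlapping qualifying windows are merged, so a merged
    cluster may span more than `window_bp`.
    """
    by_chrom = defaultdict(list)
    for chrom, pos, name in genes:
        by_chrom[chrom].append((pos, name))

    clusters = []
    for chrom in sorted(by_chrom):
        items = sorted(by_chrom[chrom])
        left = 0
        current = None  # [first_index, last_index] of the cluster being grown

        def flush(span):
            first, last = span
            names = [name for _, name in items[first:last + 1]]
            clusters.append((chrom, items[first][0], items[last][0], names))

        for right in range(len(items)):
            # Shrink the window until it spans at most window_bp.
            while items[right][0] - items[left][0] > window_bp:
                left += 1
            if right - left + 1 >= min_genes:
                if current is not None and left <= current[1]:
                    current[1] = right  # overlaps the open cluster: extend it
                else:
                    if current is not None:
                        flush(current)
                    current = [left, right]
        if current is not None:
            flush(current)
    return clusters
```

With real data, the input would come from gene coordinates in a genome annotation; the thresholds are exposed as parameters so that other window sizes can be tried.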

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    In Silico Genome-Scale Reconstruction and Validation of the Staphylococcus aureus Metabolic Network

    A genome-scale metabolic model of the Gram-positive, facultative anaerobic opportunistic pathogen Staphylococcus aureus N315 was constructed based on current genomic data, literature, and physiological information. The model comprises 774 metabolic processes representing approximately 23% of all protein-coding regions. The model was extensively validated against experimental observations and correctly predicted the main physiological properties of the wild-type strain, such as aerobic and anaerobic respiration and fermentation. Because S. aureus is frequently involved in hospital-acquired bacterial infections and is increasingly resistant to antibiotics, we also investigated the clinically relevant phenotype of small colony variants and found that the model predictions agreed with recent findings of proteome analyses. This indicates that the model is useful in assisting future experiments to elucidate the interrelationship of bacterial metabolism and resistance. To help direct future studies toward novel chemotherapeutic targets, we conducted a large-scale in silico gene deletion study that identified 158 essential intracellular reactions. A more detailed analysis showed that the biosynthesis of glycans and lipids is rather rigid with respect to circumventing gene deletions, which should make these areas particularly interesting for antibiotic development. The combination of this stoichiometric model with transcriptomic and proteomic data should allow a new quality in the analysis of clinically relevant organisms and a more rationalized system-level search for novel drug targets.

    Proteomic study of the membrane components of signalling cascades of Botrytis cinerea controlled by phosphorylation

    Protein phosphorylation and membrane proteins play an important role in the infection of plants by phytopathogenic fungi, given their involvement in signal transduction cascades. Botrytis cinerea is a well-studied necrotrophic fungus taken as a model organism in fungal plant pathology, given its broad host range and adverse economic impact. To elucidate relevant events during infection, several proteomics analyses have been performed in B. cinerea, but they cover only 10% of the total proteins predicted in the genome database of this fungus. To increase coverage, we analysed by LC-MS/MS the first-reported overlapped proteome in phytopathogenic fungi, the “phosphomembranome” of B. cinerea, combining the two most important signal transduction subproteomes. Of the 1112 membrane-associated phosphoproteins identified, 64 and 243 were classified as exclusively identified or overexpressed under glucose and deproteinized tomato cell wall conditions, respectively. Seven proteins were found under both conditions, but these presented a specific phosphorylation pattern, so they were considered as exclusively identified or overexpressed proteins. From bioinformatics analysis, those differences in the composition of membrane-associated phosphoproteins were associated with various processes, including pyruvate metabolism, unfolded protein response, oxidative stress response, autophagy and cell death. Our results suggest these proteins play a significant role in the B. cinerea pathogenic cycle.

    Identification of gene pathways implicated in Alzheimer's disease using longitudinal imaging phenotypes with sparse regression

    We present a new method for the detection of gene pathways associated with a multivariate quantitative trait, and use it to identify causal pathways associated with an imaging endophenotype characteristic of longitudinal structural change in the brains of patients with Alzheimer's disease (AD). Our method, known as pathways sparse reduced-rank regression (PsRRR), uses group lasso penalised regression to jointly model the effects of genome-wide single nucleotide polymorphisms (SNPs), grouped into functional pathways using prior knowledge of gene-gene interactions. Pathways are ranked in order of importance using a resampling strategy that exploits finite sample variability. Our application study uses whole genome scans and MR images from 464 subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. In total, 66,182 SNPs are mapped to 185 gene pathways from the KEGG pathways database. Voxel-wise imaging signatures characteristic of AD are obtained by analysing 3D patterns of structural change at 6, 12 and 24 months relative to baseline. High-ranking, AD endophenotype-associated pathways in our study include those describing chemokine, Jak-stat and insulin signalling pathways, and tight junction interactions. All of these have been previously implicated in AD biology. In a secondary analysis, we investigate SNPs and genes that may be driving pathway selection, and identify a number of previously validated AD genes including CR1, APOE and TOMM40.
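
For orientation, the generic form of a group lasso penalised least-squares problem is sketched below. The notation is assumed here and is not taken from the paper: $y$ is a single quantitative response, $X$ is the SNP genotype matrix, $\beta_g$ collects the coefficients of the SNPs assigned to pathway $g$, $w_g$ is a group weight (often chosen as the square root of the group size), and $\lambda$ controls sparsity. PsRRR itself addresses a multivariate response through a reduced-rank formulation, which this univariate form does not capture.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \frac{1}{2}\,\lVert y - X\beta \rVert_2^2
  \;+\; \lambda \sum_{g=1}^{G} w_g \,\lVert \beta_g \rVert_2
```

Because the penalty uses the unsquared Euclidean norm of each group, whole pathways tend to be dropped or retained together, which is what makes pathway-level selection possible.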

    Methods for Joint Normalization and Comparison of Hi-C data

    The development of chromatin conformation capture technology has opened new avenues of study into the 3D structure and function of the genome. Chromatin structure is known to influence gene regulation, and differences in structure are now emerging as a mechanism of regulation between, e.g., cell differentiation and disease vs. normal states. Hi-C sequencing technology now provides a way to study the 3D interactions of the chromatin over the whole genome. However, like all sequencing technologies, Hi-C suffers from several forms of bias stemming from both the technology and the DNA sequence itself. Several normalization methods have been developed for individual Hi-C datasets, but little work has been done on joint normalization methods for comparing two or more Hi-C datasets. To make full use of Hi-C data, joint normalization and statistical comparison techniques are needed to identify regions where chromatin structure differs between conditions. We develop methods for the joint normalization and comparison of two Hi-C datasets, which we then extend to more complex experimental designs. Our normalization method is novel in that it makes use of the distance-dependent nature of chromatin interactions. Our modification of the Minus vs. Average (MA) plot to the Minus vs. Distance (MD) plot allows for a nonparametric data-driven normalization technique using loess smoothing. Additionally, we present a simple statistical method using Z-scores for detecting differentially interacting regions between two datasets. Our initial method was published as the Bioconductor R package HiCcompare [http://bioconductor.org/packages/HiCcompare/](http://bioconductor.org/packages/HiCcompare/). We then further extended our normalization and comparison method for use in complex Hi-C experiments with more than two datasets and optional covariates.
We extended the normalization method to jointly normalize any number of Hi-C datasets by using a cyclic loess procedure on the MD plot. The cyclic loess normalization technique can remove between-dataset biases efficiently and effectively even when several datasets are analyzed at one time. Our comparison method implements a generalized linear model-based approach for comparing complex Hi-C experiments, which may have more than two groups and additional covariates. The extended methods are also available as a Bioconductor R package [http://bioconductor.org/packages/multiHiCcompare/](http://bioconductor.org/packages/multiHiCcompare/). Finally, we demonstrate the use of HiCcompare and multiHiCcompare in several test cases on real data, in addition to comparing them to other similar methods (https://doi.org/10.1002/cpbi.76).
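
The MD-plot idea described in this abstract can be sketched in a few lines. This is a simplified illustration, not the HiCcompare implementation: it assumes M is the log2 ratio of interaction frequencies between the two datasets and D is the distance between interacting bins in units of bin size, and it replaces the loess fit with a plain per-distance mean. All function names and the dictionary input format are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def md_table(if1, if2, bin_size):
    """Build (pair, M, D) rows for bin pairs present in both datasets.

    if1, if2: dicts mapping (start_i, start_j) -> interaction frequency.
    Assumption: M = log2(IF2 / IF1); D = |start_j - start_i| / bin_size.
    """
    rows = []
    for pair in sorted(set(if1) & set(if2)):
        a, b = if1[pair], if2[pair]
        if a > 0 and b > 0:  # the log ratio is undefined for zero counts
            m = math.log2(b / a)
            d = abs(pair[1] - pair[0]) // bin_size
            rows.append((pair, m, d))
    return rows


def center_by_distance(rows):
    """Subtract the mean M at each distance D.

    This is a crude stand-in for loess smoothing of M against D.
    """
    by_distance = defaultdict(list)
    for _, m, d in rows:
        by_distance[d].append(m)
    offsets = {d: mean(ms) for d, ms in by_distance.items()}
    return [(pair, m - offsets[d], d) for pair, m, d in rows]


def z_scores(rows):
    """Z-score of each M value across all rows (0.0 if M has no spread)."""
    ms = [m for _, m, _ in rows]
    mu, sigma = mean(ms), pstdev(ms)
    if sigma == 0:
        return [(pair, 0.0) for pair, _, _ in rows]
    return [(pair, (m - mu) / sigma) for pair, m, _ in rows]
```

A typical use would chain the three steps, `z_scores(center_by_distance(md_table(if1, if2, bin_size)))`, and flag bin pairs whose absolute Z-score exceeds a chosen threshold as candidate differential interactions.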