
    Genome-scale modeling of the protein secretory machinery in yeast

    The protein secretory machinery in Eukarya is involved in post-translational modifications (PTMs) and sorting of the secretory and many transmembrane proteins. While the secretory machinery has been well studied using classic reductionist approaches, a holistic view of its complex nature is lacking. Here, we present the first genome-scale model for the yeast secretory machinery, which captures the knowledge generated through more than 50 years of research. The model is based on the concept of a Protein Specific Information Matrix (PSIM), characterized by seven PTM features. An algorithm was developed which mimics the secretory machinery and assigns each secretory protein to a particular secretory class that determines the set of PTMs and transport steps specific to each protein. Protein abundances were integrated with the model in order to gain a system-level estimation of the metabolic demands associated with the processing of each specific protein, as well as a quantitative estimation of the activity of each component of the secretory machinery.

    OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    Background: Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering.
    Results: OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from 0.26 to 0.72 (precision 0.39 to 0.85, recall 0.16 to 0.85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances.
    Conclusion: OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at http://bionlp.sourceforge.net/

    Data access and integration in the ISPIDER proteomics grid

    Grid computing has great potential for supporting the integration of complex, fast-changing biological data repositories to enable distributed data analysis. One scenario where Grid computing has such potential is provided by proteomics resources, which are rapidly being developed with the emergence of affordable, reliable methods to study the proteome. The protein identifications arising from these methods derive from multiple repositories which need to be integrated to enable uniform access to them. A number of technologies exist which enable these resources to be accessed in a Grid environment, but the independent development of these resources means that significant data integration challenges, such as heterogeneity and schema evolution, have to be met. This paper presents an architecture which supports the combined use of Grid data access (OGSA-DAI), Grid distributed querying (OGSA-DQP) and data integration (AutoMed) software tools to support distributed data analysis. We discuss the application of this architecture for the integration of several autonomous proteomics data resources.

    Mammalian Brain As a Network of Networks

    Acknowledgements: AZ, SG and AL acknowledge support from the Russian Science Foundation (16-12-00077). Authors thank T. Kuznetsova for Fig. 6. Peer reviewed. Publisher PD

    Large-scale event extraction from literature with multi-level gene normalization

    Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique gene and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). 
    Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.

    The use of information theory in evolutionary biology

    Information is a key concept in evolutionary biology. Information is stored in biological organisms' genomes, and used to generate the organism as well as to maintain and control it. Information is also "that which evolves". When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here I review applications of information theory to the evolution of proteins as well as to the evolution of information processing in simulated agents that adapt to perform a complex task.
    Comment: 25 pages, 7 figures. To appear in "The Year in Evolutionary Biology", of the Annals of the NY Academy of Science

    Engineering ligand-responsive RNA controllers in yeast through the assembly of RNase III tuning modules

    The programming of cellular networks to achieve new biological functions depends on the development of genetic tools that link the presence of a molecular signal to gene-regulatory activity. Recently, a set of engineered RNA controllers was described that enabled predictable tuning of gene expression in the yeast Saccharomyces cerevisiae through directed cleavage of transcripts by an RNase III enzyme, Rnt1p. Here, we describe a strategy for building a new class of RNA sensing-actuation devices based on direct integration of RNA aptamers into a region of the Rnt1p hairpin that modulates Rnt1p cleavage rates. We demonstrate that ligand binding to the integrated aptamer domain is associated with a structural change sufficient to inhibit Rnt1p processing. Three tuning strategies based on the incorporation of different functional modules into the Rnt1p switch platform were demonstrated to optimize switch dynamics and ligand responsiveness. We further demonstrated that these tuning modules can be implemented combinatorially in a predictable manner to further improve the regulatory response properties of the switch. The modularity and tunability of the Rnt1p switch platform will allow for rapid optimization and tailoring of this gene control device, thus providing a useful tool for the design of complex genetic networks in yeast