503 research outputs found

    The RCSB Protein Data Bank: views of structural biology for basic and applied research and education.

    The RCSB Protein Data Bank (RCSB PDB, http://www.rcsb.org) provides access to 3D structures of biological macromolecules and is one of the leading resources in biology and biomedicine worldwide. Our efforts over the past 2 years focused on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. Herein, we describe recently introduced data annotations, including integration with external biological resources such as gene and drug databases; new visualization tools; and improved support for the mobile web. We also describe access to data files, web services and open access software components that enable software developers to mine the PDB archive and related annotations more effectively. Our efforts are aimed at expanding the role of 3D structure in understanding biology and medicine.

    Annotation and Curation of the Protein Data Bank

    The Protein Data Bank (PDB) is the worldwide repository for experimentally determined 3D structures of biological macromolecules. Established in 1971 with just seven structures, it presently includes more than 56,000 entries. To maintain the highest standards in curation and processing, the members of the worldwide Protein Data Bank (wwPDB) collaborate in data annotation and the development of procedures, tools, and resources. Annotation-related issues, particularly those impacted by new developments in structural biology, are critically reviewed at regular in-person and virtual meetings. Comprehensive documentation of the procedures, formats, and related data dictionaries used in data annotation is available at the wwPDB website (www.wwpdb.org).

    Mindful of the impact that changes in annotation procedures or data format may have on users, changes are carefully managed and communicated in a timely fashion. In cases involving complex scientific or policy issues, input is sought from advisory committees, standing task forces, experimental method developers, and community experts. This is exemplified by the creation of the recently released version of the PDB archive, which updates and further standardizes database references, small molecule chemistry, biological assemblies, and active sites.

    Open Chemistry

    An invited article on Open Chemistry discussing the importance of Open Access and Open Data and stressing the emerging role of the blogosphere.

    Chemical databases: curation or integration by user-defined equivalence?

    There is a wealth of valuable chemical information in publicly available databases for use by scientists undertaking drug discovery. However, finite curation resources, limitations of chemical structure software, and differences in individual database applications mean that exact chemical structure equivalence between databases is unlikely ever to be a reality. Identifying compound equivalence has been made significantly easier by the International Chemical Identifier (InChI), a non-proprietary line notation for describing a chemical structure. More importantly, advances in methods to identify compounds that are the same at various levels of similarity, such as those containing the same parent component or having the same connectivity, are now enabling related compounds to be linked between databases where the structure matches are not exact.
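    The idea of matching at different levels can be illustrated with the hashed form of the InChI, the InChIKey. This is a minimal sketch and not the method of the article: it assumes standard 27-character InChIKeys, whose first 14-character block hashes the connectivity layers, so two keys that share that block but differ elsewhere describe compounds with the same connectivity. The function names and the example keys in the usage note are hypothetical placeholders. Matching on the parent component (for example after salt stripping) requires structure processing and is not covered here.

```python
from enum import Enum


class Match(Enum):
    EXACT = "exact"
    SAME_CONNECTIVITY = "same connectivity"
    NONE = "none"


def connectivity_block(inchikey: str) -> str:
    """Return the first (connectivity) block of an InChIKey.

    Assumes the standard layout: 14 characters, a hyphen,
    10 characters, a hyphen, 1 character.
    """
    parts = inchikey.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
    return parts[0]


def compare_inchikeys(key_a: str, key_b: str) -> Match:
    """Classify two InChIKeys by the level at which they agree."""
    # Validate both keys first so malformed input is never reported as a match.
    block_a = connectivity_block(key_a)
    block_b = connectivity_block(key_b)
    if key_a.strip().upper() == key_b.strip().upper():
        return Match.EXACT
    if block_a == block_b:
        return Match.SAME_CONNECTIVITY
    return Match.NONE
```

    For example, with placeholder keys, `compare_inchikeys("AAAAAAAAAAAAAA-BBBBBBBBSA-N", "AAAAAAAAAAAAAA-CCCCCCCCSA-N")` reports a connectivity-level match, which is how two stereoisomers recorded in different databases would typically appear.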

    Designing algorithms to aid discovery by chemical robots

    Recently, automated robotic systems have become very efficient thanks to improved coupling between sensor systems and algorithms; the latter have gained significance with the increase in computing power over the past few decades. However, intelligent automated chemistry platforms for discovery-oriented tasks need to be able to cope with the unknown, which is a profoundly hard problem. In this Outlook, we describe how recent advances in the design and application of algorithms, coupled with the increased amount of chemical data available and with automation and control systems, may allow more productive chemical research and the development of chemical robots able to target discovery. This is shown through examples of workflow and data processing with automation and control, and through the use of both well-used and cutting-edge algorithms illustrated using recent studies in chemistry. Finally, several algorithms are presented in relation to chemical robots and chemical intelligence for knowledge discovery.

    A computational solution to automatically map metabolite libraries in the context of genome scale metabolic networks

    This article describes a generic programmatic method for mapping chemical compound libraries onto organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to determine the coverage of chemical libraries set up by two metabolomics facilities, MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP), on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing coverage statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. To achieve this goal, we tackled issues of programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks, and automatic loading of a mapping into the genome-scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single organism or a selection of organisms of interest and is thus not limited to large facilities.
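    The identifier-based matching and coverage statistics described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline of the article or the MetExplore API: the library and the network are both assumed to be plain dictionaries from an entry name or ID to an InChIKey string, entries without an identifier are simply left unmapped, and all function names are hypothetical.

```python
from typing import Dict, List


def map_library_to_network(
    library: Dict[str, str], network: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each library compound to the network metabolites sharing its InChIKey.

    library: compound name -> InChIKey (empty string if missing)
    network: metabolite ID -> InChIKey (empty string if missing)
    """
    # Index the network by InChIKey; several metabolites may share one key.
    index: Dict[str, List[str]] = {}
    for metabolite_id, key in network.items():
        if key:
            index.setdefault(key.strip().upper(), []).append(metabolite_id)

    mapping: Dict[str, List[str]] = {}
    for name, key in library.items():
        hits = index.get(key.strip().upper()) if key else None
        if hits:
            mapping[name] = sorted(hits)
    return mapping


def coverage(
    library: Dict[str, str],
    network: Dict[str, str],
    mapping: Dict[str, List[str]],
) -> Dict[str, float]:
    """Fraction of the library that was mapped and fraction of the network hit."""
    covered_ids = {m for hits in mapping.values() for m in hits}
    return {
        "library_mapped": len(mapping) / len(library) if library else 0.0,
        "network_covered": len(covered_ids) / len(network) if network else 0.0,
    }
```

    Because matching is an exact lookup on the identifier, a compound or metabolite with no InChIKey can never be mapped, which is why the identifiers must be specified on both sides.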