22 research outputs found

    IHMCIF: An Extension of the PDBx/mmCIF Data Standard for Integrative Structure Determination Methods

    Get PDF
    IHMCIF (github.com/ihmwg/IHMCIF) is a data information framework that supports archiving and disseminating macromolecular structures determined by integrative or hybrid modeling (IHM), and making them Findable, Accessible, Interoperable, and Reusable (FAIR). IHMCIF is an extension of the Protein Data Bank Exchange/macromolecular Crystallographic Information Framework (PDBx/mmCIF) that serves as the framework for the Protein Data Bank (PDB) to archive experimentally determined atomic structures of biological macromolecules and their complexes with one another and small molecule ligands (e.g., enzyme cofactors and drugs). IHMCIF serves as the foundational data standard for the PDB-Dev prototype system, developed for archiving and disseminating integrative structures. It utilizes a flexible data representation to describe integrative structures that span multiple spatiotemporal scales and structural states with definitions for restraints from a variety of experimental methods contributing to integrative structural biology. The IHMCIF extension was created with the benefit of considerable community input and recommendations gathered by the Worldwide Protein Data Bank (wwPDB) Task Force for Integrative or Hybrid Methods (wwpdb.org/task/hybrid). Herein, we describe the development of IHMCIF to support evolving methodologies and ongoing advancements in integrative structural biology. Ultimately, IHMCIF will facilitate the unification of PDB-Dev data and tools with the PDB archive so that integrative structures can be archived and disseminated through PDB

    Engineering Encodable Lanthanide-Binding Tags into Loop Regions of Proteins

    No full text
    Lanthanide-binding tags (LBTs) are valuable tools for investigation of protein structure, function, and dynamics by NMR spectroscopy, X-ray crystallography, and luminescence studies. We have inserted LBTs into three different loop positions (denoted L, R, and S) of the model protein interleukin-1β (IL1β) and varied the length of the spacer between the LBT and the protein (denoted 1−3). Luminescence studies demonstrate that all nine constructs bind Tb3+ tightly in the low nanomolar range. No significant change in the fusion protein occurs from insertion of the LBT, as shown by two X-ray crystallographic structures of the IL1β-S1 and IL1β-L3 constructs and for the remaining constructs by comparing the 1H−15N heteronuclear single-quantum coherence NMR spectra with that of the wild-type IL1β. Additionally, binding of LBT-loop IL1β proteins to their native binding partner in vitro remains unaltered. X-ray crystallographic phasing was successful using only the signal from the bound lanthanide. Large residual dipolar couplings (RDCs) could be determined by NMR spectroscopy for all LBT-loop constructs and revealed that the LBT-2 series were rigidly incorporated into the interleukin-1β structure. The paramagnetic NMR spectra of loop-LBT mutant IL1β-R2 were assigned and the Δχ tensor components were calculated on the basis of RDCs and pseudocontact shifts. A structural model of the IL1β-R2 construct was calculated using the paramagnetic restraints. The current data provide support that encodable LBTs serve as versatile biophysical tags when inserted into loop regions of proteins of known structure or predicted via homology modeling.National Science Foundation (U.S.) (Grant MCB 0744415)Ruth L. Kirschstein National Research Service AwardUnited States. Dept. of Energy (Offices of Biological and Environmental Research and of Basic Energy Sciences)National Institutes of Health (U.S.) (National Center for Research Resources

    PDBx/mmCIF Ecosystem : Foundational Semantic Tools for Structural Biology

    No full text
    PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models computed generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for its pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.publishe

    Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students

    No full text
    The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide

    ModelCIF: An extension of PDBx/mmCIF data representation for computed structure models

    No full text
    ModelCIF (github.com/ihmwg/ModelCIF) is a data information framework developed for and by computational structural biologists to enable delivery of Findable, Accessible, Interoperable, and Reusable (FAIR) data to users worldwide. ModelCIF describes the specific set of attributes and metadata associated with macromolecular structures modeled by solely computational methods and provides an extensible data representation for deposition, archiving, and public dissemination of predicted three-dimensional (3D) models of macromolecules. It is an extension of the Protein Data Bank Exchange / macromolecular Crystallographic Information Framework (PDBx/mmCIF), which is the global data standard for representing experimentally-determined 3D structures of macromolecules and associated metadata. The PDBx/mmCIF framework and its extensions (e.g., ModelCIF) are managed by the Worldwide Protein Data Bank partnership (wwPDB, wwpdb.org) in collaboration with relevant community stakeholders such as the wwPDB ModelCIF Working Group (wwpdb.org/task/modelcif). This semantically rich and extensible data framework for representing computed structure models (CSMs) accelerates the pace of scientific discovery. Herein, we describe the architecture, contents, and governance of ModelCIF, and tools and processes for maintaining and extending the data standard. Community tools and software libraries that support ModelCIF are also described
    corecore