82 research outputs found

    On vital aid: the why, what and how of validation

    The need for validation of macromolecular crystal structures is discussed. A general approach to validation is presented, together with examples of its implementation in the special case of macromolecular crystallography.

    Case-controlled structure validation.

    Although many factors influence the quality of a macromolecular crystal structure, validation criteria are usually only calibrated using one of these factors, the resolution. For many purposes this is sufficient, but there are times when one wishes to compare one set of structures with another and the comparison may be invalidated by systematic differences between the sets in factors other than resolution. This problem can be circumvented by borrowing from medicine the idea of the case-matched control: each structure of interest is matched with a control structure that has similar values for all relevant factors considered in this study. In addition to resolution, these include the size of the structure (as measured by the volume of the asymmetric unit) and the year of deposition. This approach has been applied to address two questions: whether structures from structural genomics efforts reach the same level of quality as structures from traditional sources and whether the impact factor of the journal in which a structure is published correlates with structure quality. In both cases, once factors influencing quality have been controlled in the comparison, there is little evidence for a systematic difference in quality.
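The case-matched control idea described in this abstract can be illustrated with a short sketch. This is not the authors' procedure: the greedy nearest-neighbour pairing, the `Structure` fields, the scale factors in `SCALES`, and the `max_distance` cut-off are all assumptions chosen for illustration, as are the identifiers and values in the example data. The only elements taken from the abstract are the three matching factors (resolution, asymmetric-unit volume, year of deposition).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    pdb_id: str        # hypothetical identifier, not a real PDB entry
    resolution: float  # in angstroms
    asu_volume: float  # volume of the asymmetric unit, in cubic angstroms
    year: int          # year of deposition


# Assumed scale factors: how large a difference in each factor counts as
# "one unit" of dissimilarity. These values are illustrative only.
SCALES = {"resolution": 0.1, "log_volume": 0.2, "year": 1.0}


def distance(a: Structure, b: Structure) -> float:
    """Scaled Euclidean distance over the three matching factors."""
    d_res = (a.resolution - b.resolution) / SCALES["resolution"]
    d_vol = (math.log(a.asu_volume) - math.log(b.asu_volume)) / SCALES["log_volume"]
    d_year = (a.year - b.year) / SCALES["year"]
    return math.sqrt(d_res ** 2 + d_vol ** 2 + d_year ** 2)


def match_controls(cases, pool, max_distance=3.0):
    """Greedily pair each case with its nearest unused control.

    Cases whose nearest remaining control is farther than max_distance
    are left unmatched. The result depends on the order of `cases`.
    """
    available = list(pool)
    pairs = {}
    for case in cases:
        if not available:
            break
        best = min(available, key=lambda ctrl: distance(case, ctrl))
        if distance(case, best) <= max_distance:
            pairs[case.pdb_id] = best.pdb_id
            available.remove(best)  # match without replacement
    return pairs


# Made-up example data
cases = [
    Structure("case1", 2.0, 100000.0, 2005),
    Structure("case2", 2.9, 100000.0, 1996),
    Structure("case3", 1.0, 50000.0, 2015),
]
pool = [
    Structure("ctrlA", 2.05, 110000.0, 2005),
    Structure("ctrlB", 3.0, 100000.0, 1995),
    Structure("ctrlC", 3.5, 900000.0, 1990),
]
pairs = match_controls(cases, pool)
print(pairs)
```

Once pairs are formed, a quality metric can be compared within each pair, so that differences in resolution, size, or deposition year no longer confound the comparison. Greedy matching is simple but order-dependent; an optimal assignment method would be a reasonable alternative.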

    The Protein Data Bank archive as an open data resource


    PDBe: towards reusable data delivery infrastructure at protein data bank in Europe

    © 2017 The Authors. Published by OUP. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1093/nar/gkx1070
    The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context-dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource which maps PDB data to other bioinformatics resources, and the PDBe REST API.
    Funding: Wellcome Trust [104948]; UK Biotechnology and Biological Sciences Research Council [BB/M011674/1, BB/N019172/1, BB/M020347/1]; European Union [284209]; European Molecular Biology Laboratory (EMBL). Funding for open access charge: EMBL.

    MIFA: Metadata, Incentives, Formats, and Accessibility guidelines to improve the reuse of AI datasets for bioimage analysis

    Artificial Intelligence methods are powerful tools for biological image analysis and processing. High-quality annotated images are key to training and developing new methods, but access to such data is often hindered by the lack of standards for sharing datasets. We brought together community experts in a workshop to develop guidelines to improve the reuse of bioimages and annotations for AI applications. These include standards on data formats, metadata, data presentation and sharing, and incentives to generate new datasets. We are positive that the MIFA (Metadata, Incentives, Formats, and Accessibility) recommendations will accelerate the development of AI tools for bioimage analysis by facilitating access to high-quality training data.
    Comment: 16 pages, 3 figures

    Genome3D: exploiting structure to help users understand their sequences.

    Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.