84 research outputs found

On vital aid: the why, what and how of validation

Author: Kleywegt Gerard J.
Publication venue: International Union of Crystallography

The need for validation of macromolecular crystal structures is discussed. A general approach to validation is presented, together with examples of its implementation in the special case of macromolecular crystallography.

Crossref

PubMed Central

Case-controlled structure validation.

Author: Kleywegt Gerard J
Read Randy J
Publication venue: Acta Crystallogr D Biol Crystallogr
Publication date: 20/01/2009

Although many factors influence the quality of a macromolecular crystal structure, validation criteria are usually calibrated using only one of these factors, the resolution. For many purposes this is sufficient, but there are times when one wishes to compare one set of structures with another, and the comparison may be invalidated by systematic differences between the sets in factors other than resolution. This problem can be circumvented by borrowing from medicine the idea of the case-matched control: each structure of interest is matched with a control structure that has similar values for all relevant factors considered in this study. In addition to resolution, these include the size of the structure (as measured by the volume of the asymmetric unit) and the year of deposition. This approach has been applied to address two questions: whether structures from structural genomics efforts reach the same level of quality as structures from traditional sources, and whether the impact factor of the journal in which a structure is published correlates with structure quality. In both cases, once factors influencing quality have been controlled in the comparison, there is little evidence for a systematic difference in quality.
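
The case-matched control idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the `Structure` record, the `mismatch` score, its scaling constants (0.1 Å, 0.2 in log-volume, 1 year), and the greedy pairing without replacement are all assumptions made for illustration. Only the three matching factors (resolution, asymmetric-unit volume, deposition year) come from the abstract.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Hypothetical record holding the three matching factors named in the abstract."""
    ident: str
    resolution: float   # in Angstrom
    asu_volume: float   # volume of the asymmetric unit, in cubic Angstrom
    year: int           # year of deposition


def mismatch(case: Structure, other: Structure) -> float:
    """Scaled dissimilarity between two structures (smaller is more similar).

    The scaling constants are illustrative assumptions, not published values.
    """
    return (
        abs(case.resolution - other.resolution) / 0.1
        + abs(math.log(case.asu_volume / other.asu_volume)) / 0.2
        + abs(case.year - other.year) / 1.0
    )


def match_controls(cases: list[Structure], pool: list[Structure]) -> dict[str, str]:
    """Greedily pair each case with its most similar unused control."""
    available = list(pool)
    pairs: dict[str, str] = {}
    for case in cases:
        if not available:
            break  # no controls left; remaining cases stay unmatched
        best = min(available, key=lambda s: mismatch(case, s))
        pairs[case.ident] = best.ident
        available.remove(best)  # each control is used at most once
    return pairs


cases = [Structure("case1", 2.0, 100000.0, 2005)]
pool = [
    Structure("ctrlA", 2.05, 110000.0, 2005),
    Structure("ctrlB", 3.0, 100000.0, 2005),
]
pairs = match_controls(cases, pool)
```

Quality metrics would then be compared between each case and its paired control rather than between the raw sets, so that differences in resolution, size, or deposition year do not bias the comparison.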

PubMed Central

Apollo (Cambridge)

The Protein Data Bank archive as an open data resource

Author: Gerard J. Kleywegt
Haruki Nakamura
Helen M. Berman
John L. Markley
Publication venue: 'Springer Science and Business Media LLC'

Crossref

PDBe: towards reusable data delivery infrastructure at protein data bank in Europe

Author: Alhroub Younes
Anyango Stephen
Armstrong David R
Berrisford John M
Clark Alice R
Conroy Matthew J
Dana Jose M
Deshpande Mandar
Gupta Deepti
Gutmanas Aleksandras
Haslam Pauline
Kleywegt Gerard J
Mak Lora
Mir Saqib
Mukhopadhyay Abhik
Nadzirin Nurul
Paysan-Lafosse Typhaine
Sehnal David
Sen Sanchayita
Smart Oliver S
Varadi Mihaly
Velankar Sameer
Publication venue: 'Oxford University Press (OUP)'
Publication date: 26/10/2017

© 2017 The Authors. Published by OUP. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1093/nar/gkx1070

The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context-dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource, which maps PDB data to other bioinformatics resources, and the PDBe REST API.

Funding: Wellcome Trust [104948]; UK Biotechnology and Biological Sciences Research Council [BB/M011674/1, BB/N019172/1, BB/M020347/1]; European Union [284209]; European Molecular Biology Laboratory (EMBL). Funding for open access charge: EMBL.

Crossref

Wolverhampton Intellectual Repository and E-theses

MIFA: Metadata, Incentives, Formats, and Accessibility guidelines to improve the reuse of AI datasets for bioimage analysis

Artificial Intelligence methods are powerful tools for biological image analysis and processing. High-quality annotated images are key to training and developing new methods, but access to such data is often hindered by the lack of standards for sharing datasets. We brought together community experts in a workshop to develop guidelines to improve the reuse of bioimages and annotations for AI applications. These include standards on data formats, metadata, data presentation and sharing, and incentives to generate new datasets. We are positive that the MIFA (Metadata, Incentives, Formats, and Accessibility) recommendations will accelerate the development of AI tools for bioimage analysis by facilitating access to high-quality training data.

Comment: 16 pages, 3 figures

arXiv.org e-Print Archive

Genome3D: exploiting structure to help users understand their sequences.

Author: Andreeva Antonina
Blundell Tom L
Buchan Daniel WA
Chothia Cyrus
Cozzetto Domenico
Dana José M
Filippis Ioannis
Gough Julian
Jones David T
Kelley Lawrence A
Kleywegt Gerard J
Lewis Tony E
Minneci Federico
Mistry Jaina
Murzin Alexey G
Oates Matt E
Ochoa-Montaño Bernardo
Orengo Christine
Punta Marco
Rackham Owen JL
Sillitoe Ian
Stahlhacke Jonathan
Sternberg Michael JE
Velankar Sameer
Publication venue: Nucleic Acids Res
Publication date: 27/10/2014

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.

Goldsmiths Research Online

Southampton (e-Prints Soton)

Crossref

PubMed Central

UCL Discovery

Spiral - Imperial College Digital Repository

Apollo (Cambridge)