13,711 research outputs found

    The Protein Data Bank archive as an open data resource

    The RCSB Protein Data Bank: views of structural biology for basic and applied research and education.

    The RCSB Protein Data Bank (RCSB PDB, http://www.rcsb.org) provides access to 3D structures of biological macromolecules and is one of the leading resources in biology and biomedicine worldwide. Our efforts over the past 2 years focused on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. Herein, we describe recently introduced data annotations including integration with external biological resources, such as gene and drug databases, new visualization tools and improved support for the mobile web. We also describe access to data files, web services and open access software components to enable software developers to more effectively mine the PDB archive and related annotations. Our efforts are aimed at expanding the role of 3D structure in understanding biology and medicine.

    The Dawn of Open Access to Phylogenetic Data

    The scientific enterprise depends critically on the preservation of and open access to published data. This basic tenet applies acutely to phylogenies (estimates of evolutionary relationships among species). Increasingly, phylogenies are estimated from increasingly large, genome-scale datasets using increasingly complex statistical methods that require increasing levels of expertise and computational investment. Moreover, the resulting phylogenetic data provide an explicit historical perspective that critically informs research in a vast and growing number of scientific disciplines. One such use is the study of changes in rates of lineage diversification (speciation - extinction) through time. As part of a meta-analysis in this area, we sought to collect phylogenetic data (comprising nucleotide sequence alignment and tree files) from 217 studies published in 46 journals over a 13-year period. We document our attempts to procure those data (from online archives and by direct request to corresponding authors), and report results of analyses (using Bayesian logistic regression) to assess the impact of various factors on the success of our efforts. Overall, complete phylogenetic data for ~60% of these studies are effectively lost to science. Our study indicates that phylogenetic data are more likely to be deposited in online archives and/or shared upon request when: (1) the publishing journal has a strong data-sharing policy; (2) the publishing journal has a higher impact factor; and (3) the data are requested from faculty rather than students. Although the situation appears dire, our analyses suggest that it is far from hopeless: recent initiatives by the scientific community -- including policy changes by journals and funding agencies -- are improving the state of affairs.

    Integration of Biological Sources: Exploring the Case of Protein Homology

    Data integration is a key issue in the domain of bioinformatics, which deals with huge amounts of heterogeneous biological data that grow and change rapidly. This paper serves as an introduction to the field of bioinformatics and the biological concepts it deals with, and an exploration of the integration problems a bioinformatics scientist faces. We examine ProGMap, an integrated protein homology system used by bioinformatics scientists at Wageningen University, and several use cases related to protein homology. A key issue we identify is the huge manual effort required to unify source databases into a single resource. Uncertain databases are able to contain several possible worlds, and it has been proposed that they can be used to significantly reduce initial integration efforts. We propose several directions for future work where uncertain databases can be applied to bioinformatics, with the goal of furthering the cause of bioinformatics integration.

    RCSB PDB Mobile: iOS and Android mobile apps to provide data access and visualization to the RCSB Protein Data Bank.

    Summary: The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) resource provides tools for query, analysis and visualization of the 3D structures in the PDB archive. As the mobile Web is starting to surpass desktop and laptop usage, scientists and educators are beginning to integrate mobile devices into their research and teaching. In response, we have developed the RCSB PDB Mobile app for the iOS and Android mobile platforms to enable fast and convenient access to RCSB PDB data and services. Using the app, users from the general public to expert researchers can quickly search and visualize biomolecules, and add personal annotations via the RCSB PDB's integrated MyPDB service. Availability and implementation: RCSB PDB Mobile is freely available from the Apple App Store and Google Play (http://www.rcsb.org).

    Harnessing and Sharing the Benefits of State Sponsored Research

    In recent years data-sharing has been a recurring focus of struggle within the scientific research community as improvements in information technology and digital networks have expanded the ways that data can be produced, disseminated, and used. Information technology makes it easier to share data in publicly accessible archives that aggregate data from multiple sources. Such sharing and aggregation facilitate observations that would otherwise be impossible. But data disclosure poses a dilemma for scientists. Data have long been the stock in trade of working scientists, lending credibility to their claims while highlighting new questions that are worthy of future research funding. Some disclosure is necessary in order to claim these benefits, but data disclosure may also benefit one's research competitors. Scientists who share their data promptly and freely may find themselves at a competitive disadvantage relative to free riders in the race to make future observations and thereby to earn further recognition and funding. The possibility of commercial gain further raises the competitive stakes. This article discusses data sharing in California's stem cell initiative against the background of other data sharing efforts and in light of the competing interests that the California Institute for Regenerative Medicine (CIRM) is directed to balance. We begin by considering how IP law affects data-sharing. We then assess the strategic considerations that guide the IP and data policies and strategies of federal, state, and private research sponsors. With this background, we discuss four specific sets of issues that public sponsors of data-rich research, including CIRM, are likely to confront: (1) how to motivate researchers to contribute data; (2) who may have access to the data and on what conditions; (3) what data get deposited and when do they get deposited; and (4) how to establish database architecture and curate and maintain the database.