12,227 research outputs found

    To share or not to share: Publication and quality assurance of research data outputs. A report commissioned by the Research Information Network

    No full text
    A study on current practices with respect to data creation, use, sharing and publication in eight research disciplines (systems biology, genomics, astronomy, chemical crystallography, rural economy and land use, classics, climate science and social and public health science). The study looked at data creation and care, motivations for sharing data, discovery, access and usability of datasets and quality assurance of data in each discipline

    Specimens as research objects: reconciliation across distributed repositories to enable metadata propagation

    Full text link
    Botanical specimens are shared as long-term consultable research objects in a global network of specimen repositories. Multiple specimens are generated from a shared field collection event; generated specimens are then managed individually in separate repositories and independently augmented with research and management metadata which could be propagated to their duplicate peers. Establishing a data-derived network for metadata propagation will enable the reconciliation of closely related specimens which are currently dispersed, unconnected and managed independently. Following a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 36% or 7,102,710 specimens are assessed to participate in duplication relationships, allowing the propagation of metadata among the participants in these relationships, totalling: 93,044 type citations, 1,121,865 georeferences, 1,097,168 images and 2,191,179 scientific name determinations. The results enable the creation of networks to identify which repositories could work in collaboration. Some classes of annotation (particularly those regarding scientific name determinations) represent units of scientific work: appropriate management of this data would allow the accumulation of scholarly credit to individual researchers: potential further work in this area is discussed.Comment: 9 pages, 1 table, 3 figure

    Curating E-Mails; A life-cycle approach to the management and preservation of e-mail messages

    Get PDF
    E-mail forms the backbone of communications in many modern institutions and organisations and is a valuable type of organisational, cultural, and historical record. Successful management and preservation of valuable e-mail messages and collections is therefore vital if organisational accountability is to be achieved and historical or cultural memory retained for the future. This requires attention by all stakeholders across the entire life-cycle of the e-mail records. This instalment of the Digital Curation Manual reports on the several issues involved in managing and curating e-mail messages for both current and future use. Although there is no 'one-size-fits-all' solution, this instalment outlines a generic framework for e-mail curation and preservation, provides a summary of current approaches, and addresses the technical, organisational and cultural challenges to successful e-mail management and longer-term curation.

    A text-mining system for extracting metabolic reactions from full-text articles

    Get PDF
    Background: Increasingly biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway—metabolic pathways—has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein–protein interactions. Results: When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task. Conclusions: We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein–protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed

    Theory and Practice of Data Citation

    Full text link
    Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201
    • …
    corecore