
    F1000 recommendations as a new data source for research evaluation: A comparison with citations

    F1000 is a post-publication peer review service for biological and medical research. F1000 aims to recommend important publications in the biomedical literature, and from this perspective F1000 could be an interesting tool for research evaluation. By linking the complete database of F1000 recommendations to the Web of Science bibliographic database, we are able to make a comprehensive comparison between F1000 recommendations and citations. We find that about 2% of the publications in the biomedical literature receive at least one F1000 recommendation. Recommended publications on average receive 1.30 recommendations, and over 90% of the recommendations are given within half a year after a publication has appeared. There turns out to be a clear correlation between F1000 recommendations and citations. However, the correlation is relatively weak, at least weaker than the correlation between journal impact and citations. More research is needed to identify the main reasons for differences between recommendations and citations in assessing the impact of publications.

    Large-scale event extraction from literature with multi-level gene normalization

    Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.

    Do peers see more in a paper than its authors?

    Recent years have shown a gradual shift in the content of biomedical publications that is freely accessible, from titles and abstracts to full text. This has enabled new forms of automatic text analysis and has given rise to some interesting questions: How informative is the abstract compared to the full text? What important information in the full text is not present in the abstract? What should a good summary contain that is not already in the abstract? Do authors and peers see an article differently? We answer these questions by comparing the information content of the abstract to that in citances, that is, sentences containing citations to that article. We contrast the important points of an article as judged by its authors versus as seen by peers. Focusing on the area of molecular interactions, we perform manual and automatic analysis, and we find that the set of all citances to a target article not only covers most information (entities, functions, experimental methods, and other biological concepts) found in its abstract, but also contains 20% more concepts. We further present a detailed summary of the differences across information types, and we examine the effects other citations and time have on the content of citances.

    The STROBE extensions: protocol for a qualitative assessment of content and a survey of endorsement

    Introduction: The STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) Statement was developed in response to inadequate reporting of observational studies. In recent years, several extensions to STROBE have been created to provide more nuanced field-specific guidance for authors. The content and the prevalence of extension endorsement have not yet been assessed. Accordingly, there are two aims: (1) to classify changes made in the extensions to identify strengths and weaknesses of the original STROBE checklist and (2) to determine the prevalence and typology of endorsement by journals in fields related to extensions. Methods and analysis: Two independent researchers will assess additions in each extension. Additions will be coded as 'field specific' (FS) or 'not field specific' (NFS). FS is defined as particularly relevant information for a single field and guidance provided generally cannot be extrapolated beyond that field. NFS is defined as information that reflects epidemiological or methodological tenets and can be generalised to most, if not all, types of observational research studies. Intraclass correlation will be calculated to measure reviewers' concordance. On disagreement, consensus will be sought. Individual additions will be grouped by STROBE checklist items to identify the frequency and distribution of changes. Journals in fields related to extensions will be identified through National Library of Medicine PubMed Broad Subject Terms, screened for eligibility and further distilled via Ovid MEDLINE® search strategies for observational studies. Text describing endorsement will be extracted from each journal's website. A classification scheme will be created for endorsement types and the prevalence of endorsement will be estimated. Analyses will use NVivo V.11 and SAS University Edition. Ethics and dissemination: This study does not require ethical approval as it does not involve human participants. This study has been preregistered on Open Science Framework. Peer reviewed. Postprint (author's final draft).