30 research outputs found

    The Registry of Scientometric Data Sources

    In this article, we describe the Registry of Scientometric Data Sources (RSDS) and several scientometric data sources recorded in this open registry that may be of interest to scientometricians, institutional researchers, librarians, practitioners, policy makers, students and other stakeholders with an interest in scientometrics. The registry was created after a literature review and a technical evaluation of several data sources. Each data source is recorded with descriptive metadata fields and URLs pointing to further information. This article describes the motivation behind the development of the registry, explains the features that are available on its public website (https://labs.tib.eu/rosi), and closes with a call for participation. (Comment: 20 pages, 4 figures, 1 table)
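
    A purely illustrative sketch, not part of the article: one way a registry record with descriptive metadata fields and URLs to further information could be represented. The field names and values are assumptions, not the actual RSDS schema.

        # Hypothetical registry-style record for one scientometric data source.
        # Field names and values are illustrative only, not the RSDS schema.
        example_record = {
            "name": "Example Citation Index",
            "description": "An open index of DOI-to-DOI citations.",
            "access": "open",
            "homepage": "https://example.org/citation-index",            # placeholder URL
            "documentation": "https://example.org/citation-index/docs",  # placeholder URL
        }
        print(example_record["homepage"])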

    Evaluating the availability of open citation data

    Citation data of scientific publications are essential for different purposes, such as evaluating research and building digital library collections. In this paper, we analyze the extent to which citation data of publications are openly available, using the intersection of the Crossref metadata and an Unpaywall snapshot as the publication dataset and the COCI dataset as the open citation data. We reveal that for 24.2% of the publications the citation data are openly available, while for 16.6% the citation data are closed. We find that the percentage of publications with open citation data has increased over the years. We observe that whether publications are published open access has no influence on whether their citations are openly available. However, publications in journals listed in the Directory of Open Access Journals (DOAJ) tend to have more citation data openly available than publications in other journals.
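
    As a rough sketch of the kind of computation described above (not the authors' code), the share of publications with open citation data can be estimated by intersecting DOI sets; the file names and column names below are assumptions for illustration.

        # Estimate the share of publications whose citation data appears in an
        # open citation index, by intersecting two DOI sets. The CSV files and
        # column names are hypothetical placeholders.
        import csv

        def load_dois(path, column):
            """Return a set of lowercased DOIs taken from one CSV column."""
            with open(path, newline="", encoding="utf-8") as f:
                return {row[column].strip().lower()
                        for row in csv.DictReader(f) if row.get(column)}

        publications = load_dois("crossref_unpaywall_publications.csv", "doi")  # hypothetical
        citing_dois = load_dois("coci_citations.csv", "citing")                 # hypothetical

        open_share = len(publications & citing_dois) / len(publications)
        print(f"Publications with open citation data: {open_share:.1%}")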

    Identifying and correcting invalid citations due to DOI errors in Crossref data

    This work aims to identify classes of DOI mistakes by analysing the open bibliographic metadata available in Crossref, highlighting which publishers were responsible for such mistakes and how many of these incorrect DOIs could be corrected through automatic processes. Using a list of invalid cited DOIs gathered by OpenCitations while processing the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) over the past two years, we retrieved the citations to such invalid DOIs in the January 2021 Crossref dump. We processed these citations, keeping track of their validity and of the publishers responsible for uploading the related citation data to Crossref. Finally, we identified patterns of factual errors in the invalid DOIs and the regular expressions needed to catch and correct them. The outcomes of this research show that only a few publishers were responsible for, and/or affected by, the majority of invalid citations. We extended the taxonomy of DOI name errors proposed in past studies and defined more elaborate regular expressions that can correct more of the mistakes in invalid DOIs than prior approaches. The data gathered in our study enable a qualitative investigation of the possible reasons for DOI mistakes, helping publishers identify the problems underlying their production of invalid citation data. In addition, the DOI cleaning mechanism we present could be integrated into existing processes (e.g. in COCI) to add citations by automatically correcting a wrong DOI. This study was run strictly following Open Science principles and, as such, our research outcomes are fully reproducible.
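
    As an illustration of the kind of regular-expression cleaning discussed above (the patterns below are plausible examples of common DOI string errors, not the exact rules defined in the study):

        # Illustrative regex-based cleaning of malformed DOI strings.
        # The error classes handled here (resolver prefixes, "doi:" labels,
        # trailing punctuation, stray whitespace) are examples only.
        import re

        CLEANING_RULES = [
            (re.compile(r"^\s*(?:https?://)?(?:dx\.)?doi\.org/", re.I), ""),  # resolver prefix
            (re.compile(r"^\s*doi:\s*", re.I), ""),                           # "doi:" label
            (re.compile(r"[\.,;:)\]]+$"), ""),                                # trailing punctuation
            (re.compile(r"\s+"), ""),                                         # stray whitespace
        ]

        def clean_doi(raw):
            doi = raw
            for pattern, replacement in CLEANING_RULES:
                doi = pattern.sub(replacement, doi)
            # A syntactically plausible DOI starts with the "10." directory indicator.
            return doi.lower() if doi.lower().startswith("10.") else None

        print(clean_doi("https://doi.org/10.1000/XYZ123)."))  # -> 10.1000/xyz123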

    A Programming Interface for Creating Data According to the SPAR Ontologies and the OpenCitations Data Model

    The OpenCitations Data Model (OCDM) is a data model for bibliographic metadata and citations based on the SPAR Ontologies and developed by OpenCitations to expose all the data of its collections as sets of RDF statements compliant with the OpenCitations Ontology. In this paper, we introduce oc_ocdm, a Python library for creating OCDM-compliant RDF data even when the programmer has no expertise in Semantic Web technologies. After introducing the library and its main characteristics, we present a number of projects within the OpenCitations infrastructure that adopt it as a building block.
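
    As a minimal sketch of the kind of OCDM/SPAR-style RDF statements such data contains (written here with plain rdflib rather than the oc_ocdm API; the base IRI and identifiers are placeholders):

        # A bibliographic resource typed with FaBiO, given a title, and linked to
        # a cited resource via CiTO. oc_ocdm hides this triple-level bookkeeping
        # behind a higher-level Python API; this only illustrates the resulting
        # statements, not the library's interface.
        from rdflib import Graph, Namespace, URIRef, Literal
        from rdflib.namespace import RDF, DCTERMS

        FABIO = Namespace("http://purl.org/spar/fabio/")
        CITO = Namespace("http://purl.org/spar/cito/")
        BASE = "https://example.org/oc/br/"  # placeholder base IRI

        g = Graph()
        citing = URIRef(BASE + "1")
        cited = URIRef(BASE + "2")

        g.add((citing, RDF.type, FABIO.JournalArticle))
        g.add((citing, DCTERMS.title, Literal("An example citing article")))
        g.add((citing, CITO.cites, cited))

        print(g.serialize(format="turtle"))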

    A qualitative and quantitative analysis of open citations to retracted articles: the case of Wakefield et al. (1998)

    In this article, we present the results of a quantitative and qualitative analysis of open citations to a popular and highly cited retracted paper: “Ileal-lymphoid-nodular hyperplasia, non-specific colitis and pervasive developmental disorder in children” by Wakefield et al., published in 1998. The main purpose of our study is to understand the behavior of the publications citing a retracted article and the characteristics of the citations the retracted article accumulated over time. Our analysis is based on a methodology that describes how we gathered the data, extracted the topics of the citing articles and visualized the results. The data and services used are all open and free, to foster the reproducibility of the analysis. The outcomes concern the analysis of the entities citing Wakefield et al.’s article and their related in-text citations. We observed a constantly increasing number of citations over the last 20 years, accompanied by a constant increase in the percentage of citations acknowledging the retraction. Citing articles started discussing or dealing with the retraction of Wakefield et al.’s article even before its full retraction in 2010. Articles in the social sciences domain citing Wakefield et al.’s article were among those that most often discussed its retraction. In addition, when observing the in-text citations, we noticed that a large number of the citations received by Wakefield et al.’s article focused on general discussions without recalling strictly medical details, especially after the full retraction. Medical studies did not hesitate to acknowledge the retraction of Wakefield et al.’s article and often made strongly negative statements about it.
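
    A minimal sketch of the quantitative step described above (counting citations per year and the yearly share of citing articles that acknowledge the retraction); the input file and column names are hypothetical.

        # Count citations to the retracted article per year and the yearly share
        # of citing entities that mention the retraction. Data layout is assumed.
        import csv
        from collections import Counter

        by_year, acknowledging = Counter(), Counter()
        with open("citing_entities.csv", newline="", encoding="utf-8") as f:  # hypothetical file
            for row in csv.DictReader(f):
                year = row["year"]
                by_year[year] += 1
                if row.get("mentions_retraction") == "yes":  # hypothetical column
                    acknowledging[year] += 1

        for year in sorted(by_year):
            share = acknowledging[year] / by_year[year]
            print(f"{year}: {by_year[year]} citations, {share:.0%} acknowledge the retraction")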