
    Current state of Linked Data in digital libraries

    The Semantic Web encourages institutions, including libraries, to collect, link and share their data across the Web in order to ease its processing by machines to get better queries and results. Linked Data technologies enable us to connect related data on the Web using the principles outlined by Tim Berners-Lee in 2006. Digital libraries have great potential to exchange and disseminate data linked to external resources using Linked Data. In this paper, a study about the current uses of Linked Data in digital libraries, including the most important implementations around the world, is presented. The study focuses on selected vocabularies and ontologies, benefits and problems encountered in implementing Linked Data in digital libraries. In addition, it also identifies and discusses specific challenges that digital libraries face, offering suggestions for ways in which libraries can contribute to the Semantic Web. The study uses an adapted methodology for literature review, to find data available to answer research questions. It is based on the information found in the library websites recommended by W3C Library Linked Data Incubator Group in 2011, and scientific publications from Google Scholar, Scopus, ACM and Springer from the last 5 years. The selected libraries for the study are the National Library of France, the Europeana Library, the Library of Congress of the USA, the British Library and the National Library of Spain. In this paper, we outline the best practices found in each experience and identify gaps and future trends. This work was supported by the Prometeo Project from the Secretary of Higher Education, Science, Technology and Innovation (SENESCYT) of the Ecuadorian Government and by the project GEODAS-BI (TIN2012-37493-C03-03) supported by the Ministry of Economy and Competitiveness of Spain (MINECO). Alejandro Maté was funded by the Generalitat Valenciana (APOSTD/2014/064).

    An Approach to Publish Statistics from Open-Access Journals Using Linked Data Technologies

    The Semantic Web encourages digital libraries, which include open access journals, to collect, link and share their data across the web in order to ease its processing by machines and humans to get better queries and results. Linked Data technologies enable connecting structured data across the web using the principles and recommendations set out by Tim Berners-Lee in 2006. Several universities develop knowledge, through scholarship and research, under open access policies and use several ways to disseminate information. Open access journals collect, preserve and publish scientific information in digital form using a peer review process. The evaluation of the usage of this kind of publication needs to be expressed in statistics and linked to external resources to give better information about the resources and their relationships. The statistics expressed in a data mart facilitate queries about the history of journal usage by several criteria. This data, linked to other datasets, gives more information such as: the topics in the research, the origin of the authors, the relation to the national plans, and the relation to study curricula. This paper reports a process for publishing an open access journal data mart on the Web using Linked Data technologies in such a way that it can be linked to related datasets. Furthermore, methodological guidelines are presented with related activities. The proposed process was applied by extracting statistical data from a university open journal system and publishing it in a SPARQL endpoint using the open source edition of the software OpenLink Virtuoso. In this process the use of open standards facilitates the creation, development and exploitation of knowledge. The RDF Data Cube vocabulary has been used as a model for publishing the multidimensional data on the Web. The visualization was made using CubeViz, a faceted browser that filters observations to be presented interactively in charts. 
The proposed process helps to publish statistical datasets in an easy way. This work has been partially supported by the Prometeo Project by SENESCYT, Ecuadorian Government.

    Confidentiality considerations for use of social-spatial data on the social determinants of health: Sexual and reproductive health case study

    Understanding whether and how the places where people live, work, and play are associated with health behaviors and health is essential to understanding the social determinants of health. However, social-spatial data which link a person and their attributes to a geographic location (e.g., home address) create potential confidentiality risks. Despite the growing body of literature describing approaches to protect individual confidentiality when utilizing social-spatial data, peer-reviewed manuscripts displaying identifiable individual point data or quasi-identifiers (attributes associated with the individual or disease that narrow identification) in maps persist, suggesting that knowledge has not been effectively translated into public health research practices. Using sexual and reproductive health as a case study, we explore the extent to which maps appearing in recent peer-reviewed publications risk participant confidentiality. Our scoping review of sexual and reproductive health literature published and indexed in PubMed between January 1, 2013 and September 1, 2015 identified 45 manuscripts displaying participant data in maps as points or small-population geographic units, spanning 26 journals and representing studies conducted in 20 countries. Notably, 56% (13/23) of publications presenting point data on maps either did not describe approaches used to mask data or masked data inadequately. Furthermore, 18% (4/22) of publications displaying data using small-population geographic units included at least two quasi-identifiers. These findings highlight the need for heightened education for researchers, reviewers, and editorial teams. We aim to provide readers with a primer on key confidentiality considerations when utilizing linked social-spatial data for visualizing results. 
Given the widespread availability of place-based data and the ease of creating maps, it is critically important to raise awareness of when social-spatial data constitute protected health information, best practices for masking geographic identifiers, and methods of balancing disclosure risk and scientific utility. We conclude with recommendations to support the preservation of confidentiality when disseminating results.

    TechMiner: Extracting Technologies from Academic Publications

    In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture ‘standard’ scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others. TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.

    A-posteriori provenance-enabled linking of publications and datasets via crowdsourcing

    This paper aims to share with the digital library community different opportunities to leverage crowdsourcing for a-posteriori capturing of dataset citation graphs. We describe a practical approach, which exploits one possible crowdsourcing technique to collect these graphs from domain experts and proposes their publication as Linked Data using the W3C PROV standard. Based on our findings from a study we ran during the USEWOD 2014 workshop, we propose a semi-automatic approach that generates metadata by leveraging information extraction as an additional step to crowdsourcing, to generate high-quality data citation graphs. Furthermore, we consider the design implications on our crowdsourcing approach when non-expert participants are involved in the process.

    A linked data approach to publishing complex scientific workflows

    Past data management practices in many fields of natural science, including climate research, have focused primarily on the final research output - the research publication - with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. Data were often regarded merely as an adjunct to the publication, rather than a scientific resource in their own right. In this paper, we attempt to address the issues of capturing and publishing detailed workflows associated with the climate/research datasets held by the Climatic Research Unit (CRU) at the University of East Anglia. To this end, we present a customisable approach to exposing climate research workflows for the effective re-use of the associated data, through the adoption of linked-data principles, existing widely adopted citation techniques (Digital Object Identifier) and data exchange mechanisms (Open Archives Initiative Object Reuse and Exchange).

    Science Quality and the Value of Inventions

    Despite decades of research, the relationship between the quality of science and the value of inventions has remained unclear. We present the result of a large-scale matching exercise between 4.8 million patent families and 43 million publication records. We find a strong positive relationship between quality of scientific contributions referenced in patents and the value of the respective inventions. We rank patents by the quality of the science they are linked to. Strikingly, high-rank patents are twice as valuable as low-rank patents, which in turn are about as valuable as patents without direct science link. We show this core result for various science quality and patent value measures. The effect of science quality on patent value remains relevant even when science is linked indirectly through other patents. Our findings imply that what is considered "excellent" within the science sector also leads to outstanding outcomes in the technological or commercial realm. Comment: 44 pages

    A Linked Data Approach to Sharing Workflows and Workflow Results

    A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the ‘Materials and Methods’ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics ‘materials and methods’ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively ‘light-weight’ and unobtrusive to bioinformatics users.