303 research outputs found

    Assigning Creative Commons Licenses to Research Metadata: Issues and Cases

    This paper discusses the lack of clear licensing and of transparent usage terms and conditions for research metadata. Making research data connected, discoverable and reusable is a key enabler of the new data revolution in research. We discuss how the lack of transparency hinders the discovery of research data and disconnects it from publications and other trusted research outcomes. In addition, we discuss the application of Creative Commons licenses to research metadata, and provide some examples of the applicability of this approach to internationally known data infrastructures. Comment: 9 pages. Submitted to the 29th International Conference on Legal Knowledge and Information Systems (JURIX 2016), Nice (France) 14-16 December 201

    Mapping Large Scale Research Metadata to Linked Data: A Performance Comparison of HBase, CSV and XML

    OpenAIRE, the Open Access Infrastructure for Research in Europe, comprises a database of all EC FP7 and H2020 funded research projects, including metadata of their results (publications and datasets). These data are stored in an HBase NoSQL database, post-processed, and exposed as HTML for human consumption, and as XML through a web service interface. As an intermediate format to facilitate statistical computations, CSV is generated internally. To interlink the OpenAIRE data with related data on the Web, we aim at exporting them as Linked Open Data (LOD). The LOD export is required to integrate into the overall data processing workflow, where derived data are regenerated from the base data every day. We thus faced the challenge of identifying the best-performing conversion approach. We evaluated the performance of creating LOD by a MapReduce job on top of HBase, by mapping the intermediate CSV files, and by mapping the XML output. Comment: Accepted in 0th Metadata and Semantics Research Conferenc
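    To make the CSV mapping route concrete, the following is a minimal sketch of turning CSV rows into N-Triples. It is not the OpenAIRE pipeline: the base IRI, the `id` column, and the one-predicate-per-column scheme are all assumptions chosen for illustration, and row identifiers are assumed to be IRI-safe.

```python
import csv
import io

# Hypothetical base IRI; the real OpenAIRE vocabulary is not shown in the abstract.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text):
    """Emit one triple per non-empty column, using the 'id' column as subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}result/{row['id']}>"
        for column, value in row.items():
            if column == "id" or not value:
                continue  # the id becomes the subject; empty cells yield no triple
            predicate = f"<{BASE}prop/{column}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


sample = "id,title,year\n1,Linked Data Export,2015\n"
for line in csv_to_ntriples(sample):
    print(line)
```

    A production mapping would typically use typed literals and an established vocabulary rather than plain strings under an invented namespace.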

    SPECIAL TRACK: Present and future of research metadata: where do we want to go from here?

    To steer the scientific system towards specific goals, it is first necessary to develop an effective understanding of all phases and aspects of the research workflow. Research metadata, as the collective record of traces that are generated when scientific activities take place, serves as evidence of these activities. Therefore, the availability of authoritative research metadata is essential for science-related decision-making at various levels. In the past, large-scale research metadata collections mostly dealt with items in the public record, such as bibliographic metadata about academic publications. There used to be few of these large-scale metadata collections, and they were often provided by commercial actors, which invested the necessary resources to compile and process dispersed public information with the goal of turning it into usable services. As the capabilities of available technologies increase, each day more sectors of the scientific system are becoming aware of how their activities could benefit from updating their workflows, a process often referred to as digital transformation. Thus, a plethora of tools and standards are being developed to streamline processes, increase interoperability, and in general overcome the limitations of the paper era. This is having a large effect on the quantity and quality of research metadata that is now being recorded. A clear example of the above is the case of bibliographic metadata. Currently, an increasing number of organizations, spurred by the decreasing barriers to collecting and processing large amounts of bibliographic metadata, are already providing services and datasets that rival the offerings of the traditional commercial providers. Some of these new datasets, provided under open licenses that allow unrestricted reuse and redistribution, have boosted innovation by allowing the development of downstream applications that rely on these metadata collections.
    However, as scientific activities in general and scientific communication in particular are increasingly moving to the digital space, traditional bibliographic metadata is no longer the only kind of research metadata that is being collected and processed at a large scale to inform decisions. Social network platforms now capture a portion of academic-related conversations and other kinds of interactions. Processes such as peer review that were previously carried out behind closed doors are now being opened, generating their own public trace. Publishing platforms are implementing increasingly sophisticated methods to track and mine user actions for their benefit. All these recent developments call for a discussion on the role of research metadata in the scientific system going forward. This discussion should be open to a large variety of stakeholders, including data providers, scientometricians, academic librarians, higher education institutions, policy managers, and developers of downstream applications. The topics of the contributions to this special track can include:
    • Analyses of the suitability of research metadata sources for specific use cases
    • Sustainability and governance of research metadata
    • Innovations in research metadata
    • Downstream applications of open research metadata
    • Surveillance through research metadata
    Contributions to this special track would be open to everyone interested and peer-reviewed. The format of the session would be 15-20 minutes per presentation, with time for questions after each presentation.

    A Query Integrator and Manager for the Query Web

    We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web-accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions
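    The chaining condition described above (every query returns XML, so any query can consume another's output) can be illustrated with a small sketch. This is not QI itself: the two functions, the element names, and the `kind` attribute are hypothetical stand-ins for saved queries, which in QI could be written in languages such as SPARQL or XQuery.

```python
import xml.etree.ElementTree as ET


def query_terms(source_xml):
    """First 'query': select <term> elements marked as anatomy, return XML."""
    root = ET.fromstring(source_xml)
    out = ET.Element("results")
    for term in root.findall("./term[@kind='anatomy']"):
        ET.SubElement(out, "term").text = term.text
    return ET.tostring(out, encoding="unicode")


def query_count(results_xml):
    """Second 'query': consume the first query's XML and return a count as XML."""
    root = ET.fromstring(results_xml)
    out = ET.Element("results")
    ET.SubElement(out, "count").text = str(len(root.findall("term")))
    return ET.tostring(out, encoding="unicode")


source = (
    "<ontology>"
    "<term kind='anatomy'>hippocampus</term>"
    "<term kind='disease'>epilepsy</term>"
    "<term kind='anatomy'>amygdala</term>"
    "</ontology>"
)

# Because both steps speak XML, the second can chain off the first.
print(query_count(query_terms(source)))
```

    Since the only contract between the two steps is the XML they exchange, either one could be swapped for a query in a different language without changing the other.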

    Digitometric Services for Open Archives Environments

    We describe “digitometric” services and tools that add value to open-access eprint archives using the Open Archives Initiative (OAI) Protocol for Metadata Harvesting. Celestial is an OAI cache and gateway tool. Citebase Search enhances OAI-harvested metadata with linked references harvested from the full-text to provide a web service for citation navigation and research impact analysis. Digitometrics builds on data harvested using OAI to provide advanced visualisation and hypertext navigation for the research community. Together these services provide a modular, distributed architecture for building a “semantic web” for the research literature
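    As a rough illustration of the harvesting step these services build on, the sketch below forms an OAI-PMH `ListRecords` request URL and extracts identifiers and Dublin Core titles from a response. It is not the Celestial or Citebase code: the base URL and the sample response are invented for illustration, no network request is made, and resumption tokens and error handling are omitted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical, heavily trimmed response used in place of a live harvest.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(extract_records(SAMPLE))
```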

    Modeling the semantics of contextual and content-specific research metadata using ontology languages: issues on combining CERIF and OWL

    Current Research Information Systems (CRISs) enable the maintenance of information related to the research activities of organizations and their members, including outputs or products from these activities. Such contextual information is of utmost importance for the processing of datasets and the retrieval of scientific documents, providing, for example, the key information on provenance and characteristics of research activities that is needed when searching for data or scholarly content. In the context of the expanding initiative of the Web of Linked Data, translating that information into semantic languages enables new ways of querying that benefit from the reuse of domain ontologies. In that direction, this paper reports on the engineering of an ontology-based version of the CERIF standard for CRISs using the OWL language and a proposed mapping to research datasets