13,448 research outputs found

    The lifecycle of provenance metadata and its associated challenges and opportunities

    Full text link
    This chapter outlines some of the challenges and opportunities associated with adopting provenance principles and standards in a variety of disciplines, including data publication and reuse, and information sciences

    Linked Data - the story so far

    No full text
    The term ā€œLinked Dataā€ refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertionsā€” the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward

    Recording accurate process documentation in the presence of failures

    No full text
    Scientific and business communities present unprecedented requirements on provenance, where the provenance of some data item is the process that led to that data item. Previous work has conceived a computer-based representation of past executions for determining provenance, termed process documentation, and has developed a protocol, PReP, to record process documentation in service oriented architectures. However, PReP assumes a failure free environment. The presence of failures may lead to inaccurate process documentation, which does not reflect reality and hence cannot be trustful and utilized. This paper outlines our solution, F-PReP, a protocol for recording accurate process documentation in the presence of failures

    Research Objects: Towards Exchange and Reuse of Digital Knowledge

    Get PDF
    What will researchers be publishing in the future? Whilst there is little question that the Web will be the publication platform, as scholars move away from paper towards digital content, there is a need for mechanisms that support the production of self-contained units of knowledge and facilitate the publication, sharing and reuse of such entities.

 In this paper we discuss the notion of _research objects_, semantically rich aggregations of resources, that can possess some scientific intent or support some research objective. We present a number of principles that we expect such objects and their associated services to follow

    Chemical information matters: an e-Research perspective on information and data sharing in the chemical sciences

    No full text
    Recently, a number of organisations have called for open access to scientific information and especially to the data obtained from publicly funded research, among which the Royal Society report and the European Commission press release are particularly notable. It has long been accepted that building research on the foundations laid by other scientists is both effective and efficient. Regrettably, some disciplines, chemistry being one, have been slow to recognise the value of sharing and have thus been reluctant to curate their data and information in preparation for exchanging it. The very significant increases in both the volume and the complexity of the datasets produced has encouraged the expansion of e-Research, and stimulated the development of methodologies for managing, organising, and analysing "big data". We review the evolution of cheminformatics, the amalgam of chemistry, computer science, and information technology, and assess the wider e-Science and e-Research perspective. Chemical information does matter, as do matters of communicating data and collaborating with data. For chemistry, unique identifiers, structure representations, and property descriptors are essential to the activities of sharing and exchange. Open science entails the sharing of more than mere facts: for example, the publication of negative outcomes can facilitate better understanding of which synthetic routes to choose, an aspiration of the Dial-a-Molecule Grand Challenge. The protagonists of open notebook science go even further and exchange their thoughts and plans. We consider the concepts of preservation, curation, provenance, discovery, and access in the context of the research lifecycle, and then focus on the role of metadata, particularly the ontologies on which the emerging chemical Semantic Web will depend. Among our conclusions, we present our choice of the "grand challenges" for the preservation and sharing of chemical information

    Provenance-based Auditing of Private Data Use

    No full text
    Across the world, organizations are required to comply with regulatory frameworks dictating how to manage personal information. Despite these, several cases of data leaks and exposition of private data to unauthorized recipients have been publicly and widely advertised. For authorities and system administrators to check compliance to regulations, auditing of private data processing becomes crucial in IT systems. Finding the origin of some data, determining how some data is being used, checking that the processing of some data is compatible with the purpose for which the data was captured are typical functionality that an auditing capability should support, but difficult to implement in a reusable manner. Such questions are so-called provenance questions, where provenance is defined as the process that led to some data being produced. The aim of this paper is to articulate how data provenance can be used as the underpinning approach of an auditing capability in IT systems. We present a case study based on requirements of the Data Protection Act and an application that audits the processing of private data, which we apply to an example manipulating private data in a university

    Collaboratively Patching Linked Data

    Full text link
    Today's Web of Data is noisy. Linked Data often needs extensive preprocessing to enable efficient use of heterogeneous resources. While consistent and valid data provides the key to efficient data processing and aggregation we are facing two main challenges: (1st) Identification of erroneous facts and tracking their origins in dynamically connected datasets is a difficult task, and (2nd) efforts in the curation of deficient facts in Linked Data are exchanged rather rarely. Since erroneous data often is duplicated and (re-)distributed by mashup applications it is not only the responsibility of a few original publishers to keep their data tidy, but progresses to be a mission for all distributers and consumers of Linked Data too. We present a new approach to expose and to reuse patches on erroneous data to enhance and to add quality information to the Web of Data. The feasibility of our approach is demonstrated by example of a collaborative game that patches statements in DBpedia data and provides notifications for relevant changes.Comment: 2nd International Workshop on Usage Analysis and the Web of Data (USEWOD2012) in the 21st International World Wide Web Conference (WWW2012), Lyon, France, April 17th, 201
    • ā€¦
    corecore