57 research outputs found

    To Trust or Not to Trust? Developing Trusted Digital Spaces through Timely Reliable and Personalized Provenance

    Organizations are increasingly dependent on data stored and processed by distributed, heterogeneous services to make critical, high-value decisions. However, these service-oriented computing environments are dynamic in nature and are becoming ever more complex systems of systems. In such evolving, dynamic ecosystem infrastructures, knowing how data was derived is of significant importance in determining its validity and reliability. To address this, a number of advocates and theorists argue that provenance is critical to building trust in data and in the services that generated it, as it provides evidence with which data consumers can judge the integrity of results. This paper presents a summary of the STRAPP (trusted digital Spaces through Timely Reliable And Personalised Provenance) project, which is designing and engineering mechanisms that deliver a holistic provenance solution for a number of real-world service-based decision-support systems.

    Trust on the Web: Some Web Science Research Challenges

    Web Science is the interdisciplinary study of the World Wide Web as a first-order object, undertaken to understand its relationship with the wider societies in which it is embedded and to facilitate its future engineering as a beneficial object. In this paper, research issues and challenges relating to the vital topic of trust are reviewed, showing how the Web Science agenda requires trust to be addressed and how addressing these challenges requires a range of disciplinary skills applied in an integrated manner.

    Taking “Data” (as a Topic): The Working Policies of Indifference, Purification and Differentiation

    The recent surge of interest in e-science presents an opportune moment to re-examine the fundamental idea of “data”. This paper explores the topic by reporting on the different ways in which the idea of data is handled across many disciplines. From the accounts the various disciplines themselves provide, these ways can be portrayed as the pursuit of three broad policies. The first policy is one of Indifference, which assumes the coherence of the data-concept, so that there is no need to explicate it further. The second policy is Purification, which identifies the essential characteristics of data according to the conventions of a particular discipline, with other modes systematically suppressed. The third policy allows for the Differentiation that is evident in the manifestations of data in the various disciplines that utilise information systems. Greater appreciation among information professionals of these alternative approaches to data should enhance policy formulation and systems design.

    Provenance-based Auditing of Private Data Use

    Across the world, organizations are required to comply with regulatory frameworks dictating how personal information must be managed. Despite this, several cases of data leaks and exposure of private data to unauthorized recipients have been widely publicized. For authorities and system administrators to check compliance with regulations, auditing of private-data processing becomes crucial in IT systems. Finding the origin of some data, determining how it is being used, and checking that its processing is compatible with the purpose for which it was captured are typical capabilities an auditing facility should support, but they are difficult to implement in a reusable manner. Such questions are so-called provenance questions, where provenance is defined as the process that led to some data being produced. The aim of this paper is to articulate how data provenance can be used as the underpinning approach of an auditing capability in IT systems. We present a case study based on requirements of the Data Protection Act and an application that audits the processing of private data, which we apply to an example manipulating private data in a university.
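    The origin-finding query mentioned above can be sketched as a backward traversal of derivation records. The `derived_from` structure and the university mailing-label example below are hypothetical illustrations of the idea, not the paper's implementation:

    ```python
    # Hypothetical sketch: auditing private-data use by walking provenance
    # records backwards. `derived_from` maps each data item to the items it
    # was derived from; origins() finds the ultimate sources of an item --
    # the kind of "where did this data come from?" question an auditor asks.

    def origins(item, derived_from):
        """Return the set of source items (those derived from nothing)."""
        parents = derived_from.get(item, [])
        if not parents:
            return {item}
        result = set()
        for parent in parents:
            result |= origins(parent, derived_from)
        return result

    # Example: a student's address on a mailing label traces back to the
    # enrolment form it was originally captured from.
    derived_from = {
        "mailing_label": ["address_record"],
        "address_record": ["enrolment_form"],
    }
    print(origins("mailing_label", derived_from))  # {'enrolment_form'}
    ```

    An auditor could then compare the purposes attached to the returned sources against the purpose of the current use, flagging any mismatch.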

    The Quality and Veracity of Digital Data on Health: from Electronic Health Records to Big Data.

    The quality of health information online depends on our ability to assess whether it is accurate, whether we are making this assessment as citizens/patients, or whether we are using predictive software tools. There is a vast literature on the quality of health data online, and it suggests that the various tools for ensuring such quality are not fully adequate. I propose to address this problem by getting technological, organizational, and legal tools to work together synergistically. Integral to this vision, across all three elements, is the training needed for professionals delivering healthcare services as well as for patients using and generating health information online.

    The Quality and Veracity of Digital Health Data: from the Medical Record to Big Data

    The quality of health information online depends on our ability to assess whether it is accurate, whether we are making this assessment as citizens/patients, or whether we are using predictive software tools. There is a vast literature on the quality of health data online, and it suggests that the various tools for ensuring such quality are not fully adequate. I propose to address this problem by getting technological, organizational, and legal tools to work together synergistically. Integral to this vision, across all three elements, is the training needed for professionals delivering healthcare services as well as for patients using and generating health information online.

    Modelling Knowledge about Software Processes using Provenance Graphs and its Application to Git-based Version Control Systems

    Using the W3C PROV data model, we present a general provenance model for software development processes and, as an example, specialized models for git services, for which we generate provenance graphs. Provenance graphs are knowledge graphs, since they have defined semantics, and they can be analyzed with graph algorithms or semantic reasoning to gain insights into the underlying processes.
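    As an illustration of the idea, the sketch below maps git commits onto PROV-style Entities, Activities, and Agents and answers a simple graph query. The class and method names are hypothetical simplifications built on the Python standard library only, not the paper's actual model:

    ```python
    from dataclasses import dataclass, field

    # Minimal PROV-style provenance graph for git commits. The relation
    # names mirror W3C PROV concepts (used, wasGeneratedBy,
    # wasAssociatedWith), but the structure is an illustrative sketch.

    @dataclass
    class ProvGraph:
        entities: set = field(default_factory=set)     # file versions
        activities: set = field(default_factory=set)   # commits
        agents: set = field(default_factory=set)       # authors
        relations: list = field(default_factory=list)  # (subject, predicate, object)

        def commit(self, commit_id, author, used_files, generated_files):
            """Record one git commit as a PROV Activity."""
            self.activities.add(commit_id)
            self.agents.add(author)
            self.relations.append((commit_id, "wasAssociatedWith", author))
            for f in used_files:
                self.entities.add(f)
                self.relations.append((commit_id, "used", f))
            for f in generated_files:
                self.entities.add(f)
                self.relations.append((f, "wasGeneratedBy", commit_id))

        def generated_by(self, entity):
            """Graph query: which activity produced this entity?"""
            return [obj for (subj, pred, obj) in self.relations
                    if subj == entity and pred == "wasGeneratedBy"]

    g = ProvGraph()
    g.commit("c1", "alice", [], ["main.py@v1"])
    g.commit("c2", "bob", ["main.py@v1"], ["main.py@v2"])
    print(g.generated_by("main.py@v2"))  # ['c2']
    ```

    Because the relations have defined semantics, the same triples could be serialized to an RDF store and queried with semantic reasoners rather than hand-written traversals.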

    Linking to Scientific Data: Identity Problems of Unruly and Poorly Bounded Digital Objects

    Within information systems, a significant aspect of search and retrieval across information objects, such as datasets, journal articles, or images, relies on the identity construction of the objects. This paper uses identity to refer to the qualities or characteristics of an information object that make it definable and recognizable and that can be used to distinguish it from other objects. Identity, in this context, can be seen as the foundation from which citations, metadata, and identifiers are constructed. In recent years the idea of including datasets within the scientific record has been gaining significant momentum, with publishers, granting agencies, and libraries engaging with the challenge. However, the task has been fraught with questions of best practice for establishing this infrastructure, especially with regard to how citations, metadata, and identifiers should be constructed. These questions suggest a problem with how dataset identities are formed, such that an engagement with the definition of datasets as conceptual objects is warranted. This paper explores some of the ways in which scientific data is an unruly and poorly bounded object, and goes on to propose that, in order for datasets to fulfill the roles expected of them, the following identity functions are essential for scholarly publications: (i) the dataset is constructed as a semantically and logically concrete object, (ii) the identity of the dataset is embedded, inherent, and/or inseparable, (iii) the identity embodies a framework of authorship, rights, and limitations, and (iv) the identity translates into an actionable mechanism for retrieval or reference.