29,559 research outputs found

    Estimating The Quality Of Data Using Provenance: A Case Study In Escience

    Data quality assessment is a key factor in data-intensive domains. The data deluge is aggravated by an increasing need for interoperability and cooperation across groups and organizations. New alternatives must be found to select the data that best satisfy users' needs in a given context. This paper presents a strategy to provide information to support the evaluation of the quality of data sets. This strategy is based on combining metadata on the provenance of a data set (derived from workflows that generate it) and quality dimensions defined by the set's users, based on the desired context of use. Our solution, validated via a case study, takes advantage of a semantic model to preserve data provenance related to applications in a specific domain. © (2013) by the AIS/ICIS Administrative Office. All rights reserved.

    QUAL : A Provenance-Aware Quality Model

    The research described here is supported by the award made by the RCUK Digital Economy program to the dot.rural Digital Economy Hub; award reference: EP/G066051/1. Peer reviewed. Postprint.

    Assessing the Quality of Semantic Sensor Data

    Acknowledgements: The research described here is supported by the award made by the RCUK Digital Economy programme to the dot.rural Digital Economy Hub; award reference: EP/G066051/1. Publisher PDF.

    Managing the Provenance of Crowdsourced Disruption Reports

    A paid open access option is available for this journal. Author's own final version only can be archived; publisher's version/PDF cannot be used. Archiving is permitted on the author's website immediately, and on any open access repository after 12 months from publication. The published source must be acknowledged. Must link to the publisher version, with a set phrase to accompany the link to the published version (see policy). Articles in some journals can be made Open Access on payment of an additional charge. Publisher PDF.

    Enhanced Trustworthy and High-Quality Information Retrieval System for Web Search Engines

    The WWW is the most important source of information, but there is no guarantee that the information is correct: search engines retrieve a great deal of conflicting information, and its quality varies from low to high. We provide enhanced trustworthiness for both specific (entity) and broad (content) queries in web searching. Trustworthiness filtering is based on five factors: Provenance, Authority, Age, Popularity, and Related Links. The trustworthiness calculated from these five factors is stored, which improves performance when retrieving trustworthy websites; it is stored only for static websites. Quality is provided based on policies selected by the user. Quality-based ranking of the retrieved trusted information is provided using the WIQA (Web Information Quality Assessment) Framework.
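
The abstract above names five trust factors and says the computed score is stored only for static websites, but it does not say how the factors are combined. The following is a minimal sketch under stated assumptions: factor values on a 0..1 scale, an equally weighted mean as the combination rule, and the hypothetical names `trust_score` and `TrustCache`. None of these come from the paper.

```python
# Factor names are taken from the abstract. The 0..1 scale, the default
# equal weights, and the weighted-mean rule are illustrative assumptions.
FACTORS = ("provenance", "authority", "age", "popularity", "related_links")


def trust_score(factors, weights=None):
    """Combine the five factor values into one score (assumed weighted mean)."""
    weights = weights or {name: 1.0 for name in FACTORS}
    total_weight = sum(weights[name] for name in FACTORS)
    weighted_sum = sum(weights[name] * factors[name] for name in FACTORS)
    return weighted_sum / total_weight


class TrustCache:
    """Keeps computed scores, but only for websites flagged as static."""

    def __init__(self):
        self._scores = {}

    def score(self, url, factors, is_static):
        # A static page's stored score is reused instead of being recomputed.
        if is_static and url in self._scores:
            return self._scores[url]
        value = trust_score(factors)
        if is_static:
            self._scores[url] = value
        return value
```

In this sketch a dynamic page is rescored on every request, while a static page keeps its first score until the cache is discarded; a real system would also need some invalidation policy, which the abstract does not describe.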

    Trust and Risk Relationship Analysis on a Workflow Basis: A Use Case

    Trust and risk are often seen as proportional to each other, such that high trust may induce low risk and vice versa. However, recent research argues that the relationship between trust and risk is implicit rather than proportional. Considering that trust and risk are implicit, this paper proposes, for the first time, an approach that views trust and risk on the basis of the W3C PROV provenance data model applied in the healthcare domain. We argue that, in the healthcare domain, high trust can be placed in data despite its high risk, and that low-trust data can have low risk, depending on the data quality attributes and provenance. This is demonstrated by our trust and risk models applied to the BII case study data. The proposed theoretical approach first calculates risk values at each workflow step, considering PROV concepts, and then aggregates the final risk score for the whole provenance chain. Unlike the risk model, the trust of a workflow is derived by applying the DS/AHP method. The results prove our assumption that the trust and risk relationship is implicit.
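
The two-stage risk calculation described in the abstract above (a risk value per workflow step, then one aggregated score for the whole provenance chain) can be sketched as follows. The abstract gives neither formula, so this sketch assumes a per-step risk of likelihood times impact and an aggregate of "at least one step fails" over independent steps. The `Step` type and both function names are hypothetical, and the DS/AHP trust derivation is not covered.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Step:
    """One workflow step, loosely corresponding to a PROV activity."""
    activity: str
    likelihood: float  # assumed: probability of a problem, in 0..1
    impact: float      # assumed: severity of that problem, in 0..1


def step_risk(step):
    """Assumed per-step rule: risk = likelihood * impact."""
    return step.likelihood * step.impact


def chain_risk(steps):
    """Assumed aggregation: 1 - product of (1 - step risk) over the chain."""
    return 1.0 - prod(1.0 - step_risk(s) for s in steps)


chain = [Step("collect", 0.5, 1.0), Step("transform", 0.5, 1.0)]
print(chain_risk(chain))  # 0.75
```

Other aggregations, such as the maximum or the mean of the step risks, would fit the abstract's description equally well; the choice here is only for illustration.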

    Provenance in Linked Data Integration

    The open world of the (Semantic) Web is a global information space offering diverse materials of disparate qualities, and the opportunity to re-use, aggregate, and integrate these materials in novel ways. The advent of Linked Data brings the potential to expose data on the Web, creating new challenges for data consumers who want to integrate these data. One challenge is the ability of users to assess the reliability and/or the accuracy of the data they come across. In this paper, we describe a light-weight provenance extension for the voiD vocabulary that allows data publishers to add provenance metadata to their datasets. These provenance metadata can be queried by consumers and used as contextual information for integration and inter-operation of information resources on the Semantic Web.

    Utilising Provenance to Enhance Social Computation

    Postprint.