47,138 research outputs found

Provenance as First Class Cloud Data

Author: Muniswamy-Reddy Kiran-Kumar
Seltzer Margo I.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/11/2011

Digital provenance is metadata that describes the ancestry or history of a digital object. Most work on provenance focuses on how provenance increases the value of data to consumers. However, provenance is also valuable to storage providers. For example, provenance can provide hints on access patterns, detect anomalous behavior, and provide enhanced user search capabilities. As the next-generation storage providers, cloud vendors are in a unique position to capitalize on this opportunity to incorporate provenance as a fundamental storage system primitive. To date, cloud offerings have not done so. We provide motivation for providers to treat provenance as first-class data in the cloud and, based on our experience with provenance in a local storage system, suggest a set of requirements that make provenance feasible and attractive.

Engineering and Applied Science

Harvard University - DASH

Technical Note: Comparison of storage strategies of sea surface microlayer samples

Author: Mann Paul
Salter Matthew
Schneider-Zapp Klaus
Upstill-Goddard Robert
Publication venue: Copernicus GmbH
Publication date: 01/07/2013

The sea surface microlayer (SML) is an important biogeochemical system whose physico-chemical analysis often necessitates some degree of sample storage. However, many SML components degrade with time so the development of optimal storage protocols is paramount. We here briefly review some commonly used treatment and storage protocols. Using freshwater and saline SML samples from a river estuary, we investigated temporal changes in surfactant activity (SA) and the absorbance and fluorescence of chromophoric dissolved organic matter (CDOM) over four weeks, following selected sample treatment and storage protocols. Some variability in the effectiveness of individual protocols most likely reflects sample provenance. None of the various protocols examined performed any better than dark storage at 4 °C without pre-treatment. We therefore recommend storing samples refrigerated in the dark.

Northumbria Research Link

Crossref

Directory of Open Access Journals

Distributed storage and querying techniques for a semantic web of scientific workflow provenance

Author: Navarro Jaime Alberto
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/08/2010

In scientific workflow environments, scientists depend on provenance, which records the history of an experiment. The Resource Description Framework (RDF) is frequently used to represent provenance based on vocabularies such as the Open Provenance Model. For complex scientific workflows that generate large numbers of RDF triples, single-machine provenance management becomes inadequate over time. In this thesis, we research how HBase capabilities can be leveraged for distributed storage and querying of provenance data represented in RDF. We architect the ProvBase system that incorporates an HBase/Hadoop backend, propose a storage schema to hold provenance triples, and design querying algorithms to evaluate SPARQL queries in the system. We conduct an experimental study to show the feasibility of our approach.

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

Layering in Provenance Systems

Author: Braun Uri Jacob
Holland David A
Macko Peter
Maclean Diana
Margo Daniel Wyatt
Muniswamy-Reddy Kiran-Kumar
Seltzer Margo I.
Smogor Robin
Publication venue: USENIX Association
Publication date: 06/10/2011

Digital provenance describes the ancestry or history of a digital object. Most existing provenance systems, however, operate at only one level of abstraction: the system call layer, a workflow specification, or the high-level constructs of a particular application. The provenance collectable in each of these layers is different, and all of it can be important. Single-layer systems fail to account for the different levels of abstraction at which users need to reason about their data and processes. These systems cannot integrate data provenance across layers and cannot answer questions that require an integrated view of the provenance. We have designed a provenance collection structure facilitating the integration of provenance across multiple levels of abstraction, including a workflow engine, a web browser, and an initial runtime Python provenance tracking wrapper. We layer these components atop provenance-aware network storage (NFS) that builds upon a Provenance-Aware Storage System (PASS). We discuss the challenges of building systems that integrate provenance across multiple layers of abstraction, present how we augmented systems in each layer to integrate provenance, and present use cases that demonstrate how provenance spanning multiple layers provides functionality not available in existing systems. Our evaluation shows that the overheads imposed by layering provenance systems are reasonable.

Engineering and Applied Science

Harvard University - DASH

Provenance in scientific workflow systems

Author: Davidson Susan
Freire Juliana
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2007

Journal Article

The automated tracking and storage of provenance information promises to be a major advantage of scientific workflow systems. We discuss issues related to data and workflow provenance, and present techniques for focusing user attention on meaningful provenance through "user views," for managing the provenance of nested scientific data, and for using information about the evolution of a workflow specification to understand the difference in the provenance of similar data products.

The University of Utah: J. Willard Marriott Digital Library

Presentation Panel on Management and Storage

Author: Boyer Douglas Martin
Publication venue: Iowa Research Online
Publication date: 05/02/2018

What data should be kept over the long term? How does one track digital provenance? Should we track digital provenance? What constitutes a master/archival copy? What options are being evaluated for storage of large data sets “in perpetuity”? Cost/benefit analysis for storage solutions? What challenges are repositories facing with this data type? http://www.dpconline.org/handbook/digital-preservation/why-digital-preservation-matter

Iowa Research Online

Provenance-Aware Sensor Data Storage

Author: Braun Uri
Holland David A.
Ledlie Jonathan
Muniswamy-Reddy Kiran-Kumar
Ng Chaki
Seltzer Margo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

Sensor network data has both historical and realtime value. Making historical sensor data useful, in particular, requires storage, naming, and indexing. Sensor data presents new challenges in these areas. Such data is location-specific but also distributed; it is collected in a particular physical location and may be most useful there, but it has additional value when combined with other sensor data collections in a larger distributed system. Thus, arranging location-sensitive peer-to-peer storage is one challenge. Sensor data sets do not have obvious names, so naming them in a globally useful fashion is another challenge. The last challenge arises from the need to index these sensor data sets to make them searchable. The key to sensor data identity is provenance, the full history or lineage of the data. We show how provenance addresses the naming and indexing issues and then present a research agenda for constructing distributed, indexed repositories of sensor data.

Engineering and Applied Science

CiteSeerX

Harvard University - DASH