Provenance for computational tasks: a survey
Journal Article: The problem of systematically capturing and managing provenance for computational tasks has recently received significant attention because of its relevance to a wide range of domains and applications. The authors give an overview of important concepts related to provenance management, so that potential users can make informed decisions when selecting or designing a provenance solution.
A primer on provenance
Better understanding data requires tracking its history and context.
High-Fidelity Provenance: Exploring the Intersection of Provenance and Security
In the past 25 years, the World Wide Web has disrupted the way news is disseminated and consumed. However, the euphoria over the democratization of news publishing was soon followed by scepticism, as a new phenomenon emerged: fake news. With no gatekeepers to vouch for it, the veracity of the information served over the World Wide Web became a major public concern. The Reuters Digital News Report 2020 reports that in at least half of the EU member countries, 50% or more of the population is concerned about online fake news. To help address the problem of trust in information communicated over the World Wide Web, it has been proposed to also make available the provenance metadata of the information. Similar to artwork provenance, this would include a detailed track of how the information was created, updated and propagated to produce the result we read, as well as what agents (human or software) were involved in the process. However, keeping track of provenance information is a non-trivial task. Current approaches are often of limited scope and may require modifying existing applications to also generate provenance information along with their regular output. This thesis explores how provenance can be automatically tracked in an application-agnostic manner, without having to modify the individual applications. We frame provenance capture as a data flow analysis problem and explore the use of dynamic taint analysis in this context. Our work shows that this approach improves on the quality of provenance captured compared to traditional approaches, yielding what we term high-fidelity provenance. We explore the performance cost of this approach and use deterministic record and replay to bring it down to a more practical level. Furthermore, we create and present the tooling necessary for expanding the use of deterministic record and replay for provenance analysis.
The thesis concludes with an application of high-fidelity provenance as a tool for state-of-the-art offensive security analysis, based on the intuition that software too can be misguided by "fake news". This demonstrates that the potential uses of high-fidelity provenance for security extend beyond traditional forensic analysis.
ProvLight: Efficient Workflow Provenance Capture on the Edge-to-Cloud Continuum
Modern scientific workflows require hybrid infrastructures combining numerous
decentralized resources on the IoT/Edge interconnected to Cloud/HPC systems
(aka the Computing Continuum) to enable their optimized execution.
Understanding and optimizing the performance of such complex Edge-to-Cloud
workflows is challenging. Capturing the provenance of key performance
indicators, with their related data and processes, may assist in understanding
and optimizing workflow executions. However, the capture overhead can be
prohibitive, particularly in resource-constrained devices, such as the ones on
the IoT/Edge. To address this challenge, based on a performance analysis of
existing systems, we propose ProvLight, a tool to enable efficient provenance
capture on the IoT/Edge. We leverage simplified data models, data compression
and grouping, and lightweight transmission protocols to reduce overheads. We
further integrate ProvLight into the E2Clab framework to enable workflow
provenance capture across the Edge-to-Cloud Continuum. This integration makes
E2Clab a promising platform for the performance optimization of applications
through reproducible experiments. We validate ProvLight at a large scale with
synthetic workloads on 64 real-life IoT/Edge devices in the FIT IoT LAB
testbed. Evaluations show that ProvLight outperforms state-of-the-art systems
like ProvLake and DfAnalyzer in resource-constrained devices. ProvLight is
26-37x faster to capture and transmit provenance data; uses 5-7x less CPU;
2x less memory; transmits 2x less data; and consumes 2-2.5x less energy.
ProvLight and E2Clab are available as open-source tools.
Early survival and growth plasticity of 33 species planted in 38 arboreta across the European Atlantic area
To anticipate European climate scenarios for the end of the century, we explored the climate
gradient within the REINFFORCE (RÉseau INFrastructure de recherche pour le suivi et l’adaptation
des FORêts au Changement climatiquE) arboreta network, established in 38 sites between latitudes
37° and 57°, where 33 tree species are represented. We aim to determine which climatic variables
best explain their survival and growth, and to identify those species that are more tolerant of climate
variation and those whose growth and survival the future climate might constrain. We used
empirical models to determine the best climatic predictor variables that explain tree survival and
growth. Precipitation-transfer distance was most important for the survival of broadleaved species,
whereas growing-season-degree days best explained conifer-tree survival. Growth (annual height
increment) was mainly explained by a derived annual dryness index (ADI) for both conifers and
broadleaved trees. Species that showed the greatest variation in survival and growth in response
to climatic variation included Betula pendula Roth, Pinus elliottii Engelm., and Thuja plicata Donn
ex D.Don, and those that were least affected included Quercus shumardii Buckland and Pinus nigra
J.F.Arnold. We also demonstrated that provenance differences were significant for Pinus pinea L., Quercus robur L., and Ceratonia siliqua L. Here, we demonstrate the usefulness of infrastructures along
a climatic gradient like REINFFORCE to determine major tendencies of tree species responding to
climate changes.