1,983 research outputs found
Causality and the semantics of provenance
Provenance, or information about the sources, derivation, custody or history
of data, has been studied recently in a number of contexts, including
databases, scientific workflows and the Semantic Web. Many provenance
mechanisms have been developed, motivated by informal notions such as
influence, dependence, explanation and causality. However, there has been
little study of whether these mechanisms formally satisfy appropriate policies
or even how to formalize relevant motivating concepts such as causality. We
contend that mathematical models of these concepts are needed to justify and
compare provenance techniques. In this paper we review a theory of causality
based on structural models that has been developed in artificial intelligence,
and describe work in progress on a causal semantics for provenance graphs.Comment: Workshop submissio
Quire: Lightweight Provenance for Smart Phone Operating Systems
Smartphone apps often run with full privileges to access the network and
sensitive local resources, making it difficult for remote systems to have any
trust in the provenance of network connections they receive. Even within the
phone, different apps with different privileges can communicate with one
another, allowing one app to trick another into improperly exercising its
privileges (a Confused Deputy attack). In Quire, we engineered two new security
mechanisms into Android to address these issues. First, we track the call chain
of IPCs, allowing an app the choice of operating with the diminished privileges
of its callers or to act explicitly on its own behalf. Second, a lightweight
signature scheme allows any app to create a signed statement that can be
verified anywhere inside the phone. Both of these mechanisms are reflected in
network RPCs, allowing remote systems visibility into the state of the phone
when an RPC is made. We demonstrate the usefulness of Quire with two example
applications. We built an advertising service, running distinctly from the app
which wants to display ads, which can validate clicks passed to it from its
host. We also built a payment service, allowing an app to issue a request which
the payment service validates with the user. An app cannot not forge a payment
request by directly connecting to the remote server, nor can the local payment
service tamper with the request
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions
- …