Provenance-based computing
Relying on increasingly complex computing systems is difficult: with many
factors potentially affecting the result of a computation or its properties,
understanding where problems appear and how to fix them is a challenging
proposition. Typically, the process of finding solutions is driven by trial
and error or by experience-based insights.
In this dissertation, I examine the idea of using provenance metadata (the set
of elements that have contributed to the existence of a piece of data, together
with their relationships) instead. I show that considering provenance a
primitive of computation enables the exploration of system behaviour, targeting
both retrospective analysis (root cause analysis, performance tuning) and
hypothetical scenarios (what-if questions). In this context, provenance can be
used as part of feedback loops with a dual purpose: building software that is
able to adapt to meet given quality and performance targets (semi-automated
tuning), and enabling human operators to exert high-level runtime control with
limited prior knowledge of a system's internal architecture.
My contributions towards this goal are threefold: providing low-level
mechanisms for meaningful provenance collection that account for OS-level
resource multiplexing; demonstrating that such provenance data supports
inferences about application behaviour; and generalising this to a set of
primitives necessary for fine-grained provenance disclosure in a wider context.
To derive such primitives in a bottom-up manner, I first present Resourceful, a
framework that enables capturing OS-level measurements in the context of
application activities. It is the contextualisation that allows tying the
measurements to provenance in a meaningful way, and I examine a number of
use cases for understanding application performance. This also provides a good
setup for evaluating the impact and overheads of fine-grained provenance
collection.
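As an illustration of the contextualisation idea, the sketch below ties coarse OS-level measurements (user and system CPU time from getrusage, Unix only) to a named application activity. The interface and names are hypothetical, not Resourceful's actual API, which captures far finer-grained, per-subsystem kernel measurements.

import resource
from contextlib import contextmanager

measurements = []

@contextmanager
def activity(name):
    # Hypothetical stand-in for Resourceful: capture OS-level resource
    # usage deltas in the context of a named application activity.
    before = resource.getrusage(resource.RUSAGE_SELF)
    try:
        yield
    finally:
        after = resource.getrusage(resource.RUSAGE_SELF)
        measurements.append({
            "activity": name,
            "user_s": after.ru_utime - before.ru_utime,
            "sys_s": after.ru_stime - before.ru_stime,
        })

with activity("serve_request"):
    sum(i * i for i in range(10**6))  # placeholder application work
print(measurements)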
I then show that the collected data enables new ways of understanding
performance variation by attributing it to specific components within a
system. The resulting set of tools, Soroban, gives developers and operations
engineers a principled way of examining the impact of various configuration, OS and virtualization parameters on application behaviour.
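To give a flavour of component-level attribution, here is a minimal sketch using Gaussian process regression, the model family serialised in Soroban's released data (as the older sklearn GaussianProcess class; the sketch uses its modern replacement, GaussianProcessRegressor). The features and data below are synthetic, purely for illustration.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
# Synthetic stand-ins for per-request measurements, e.g. time spent
# in the hypervisor and in the scheduler (columns), against latency.
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, 200)

gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
mean, std = gp.predict([[0.8, 0.2]], return_std=True)
print(f"predicted latency: {mean[0]:.2f} +/- {std[0]:.2f}")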
Finally, I consider how this supports the idea that provenance should be
disclosed at application level, and discuss why such disclosure is necessary
for using the collected metadata efficiently, at a granularity that is
meaningful in relation to application semantics.
CHESS Scholarship Scheme; EPSRC
ProvMark: A Provenance Expressiveness Benchmarking System
System level provenance is of widespread interest for applications such as
security enforcement and information protection. However, testing the
correctness or completeness of provenance capture tools is challenging and
currently done manually. In some cases there is not even a clear consensus
about what behavior is correct. We present an automated tool, ProvMark, that
uses an existing provenance system as a black box and reliably identifies the
provenance graph structure recorded for a given activity, by a reduction to
subgraph isomorphism problems handled by an external solver. ProvMark is a
first step in the much-needed area of testing and comparing the expressiveness
of provenance systems. We demonstrate ProvMark's usefulness in comparing three
capture systems with different architectures and distinct design philosophies.
To appear in Middleware 2019.
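The core reduction can be illustrated with a toy example: compare a "background" recording (capture tool running, no target activity) against a "foreground" recording that includes the target activity, and test embedding via subgraph isomorphism. The node and edge labels below are illustrative, not ProvMark's vocabulary, and networkx's VF2 matcher stands in for the external solver.

import networkx as nx
from networkx.algorithms import isomorphism

background = nx.DiGraph()
background.add_edge("shell_proc", "log_file", label="wasGeneratedBy")

foreground = background.copy()
foreground.add_edge("shell_proc", "tmp_file", label="wasGeneratedBy")

# Does the background recording embed in the foreground one?
gm = isomorphism.DiGraphMatcher(foreground, background)
print("background embeds:", gm.subgraph_is_isomorphic())

# The extra structure approximates the provenance recorded for the
# target activity itself.
extra = set(foreground.edges()) - set(background.edges())
print("activity-specific edges:", extra)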
Improving the Visualization of Electron-Microscopy Data Through Optical Flow Interpolation
Technical developments in neurobiology have reached a point where the acquisition of high-resolution images representing individual neurons and synapses becomes possible. For this, brain tissue samples are sliced using a diamond knife and imaged with electron microscopy (EM). However, the technique achieves a low resolution in the cutting direction, due to limitations of the mechanical process, making a direct visualization of a dataset difficult. We aim to increase the depth resolution of the volume by adding new image slices interpolated from the existing ones, without requiring modifications to the EM image-capturing method. As classical interpolation methods do not provide satisfactory results on this type of data, this paper reframes the problem in terms of motion volumes, considering the depth axis as a temporal axis. An optical flow method is adapted to estimate the motion vectors of pixels in the EM images, and this information is used to compute and insert multiple new images at certain depths in the volume. We evaluate the visualization results against interpolation methods currently used on EM data, transforming the highly anisotropic original dataset into one with a higher depth resolution. The interpolation based on optical flow better reveals neurite structures with realistic, undistorted shapes, and makes neuronal connections easier to map.
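A minimal sketch of the approach, using OpenCV's Farneback flow as a stand-in for the paper's adapted optical flow method: estimate motion between adjacent slices, backward-warp each endpoint halfway along the flow, and blend the warps to synthesise the intermediate slice.

import numpy as np
import cv2  # opencv-python

def interpolate_midslice(slice_a, slice_b):
    # Treat the depth axis as time and estimate dense motion a -> b.
    flow = cv2.calcOpticalFlowFarneback(
        slice_a, slice_b, None, 0.5, 4, 15, 3, 5, 1.1, 0)
    h, w = slice_a.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    # Warp each endpoint halfway along the motion and blend: a simple
    # approximation of motion-compensated interpolation.
    warp_a = cv2.remap(slice_a, gx - 0.5 * flow[..., 0],
                       gy - 0.5 * flow[..., 1], cv2.INTER_LINEAR)
    warp_b = cv2.remap(slice_b, gx + 0.5 * flow[..., 0],
                       gy + 0.5 * flow[..., 1], cv2.INTER_LINEAR)
    mid = (warp_a.astype(np.float32) + warp_b.astype(np.float32)) / 2.0
    return mid.astype(np.uint8)

# Synthetic example: a bright blob shifting six pixels between slices.
a = np.zeros((128, 128), np.uint8); cv2.circle(a, (40, 64), 10, 255, -1)
b = np.zeros((128, 128), np.uint8); cv2.circle(b, (46, 64), 10, 255, -1)
mid = interpolate_midslice(a, b)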
IPAPI: Designing an Improved Provenance API
We investigate the main limitations imposed by existing provenance systems in the development of provenance-aware applications. In the case of disclosed provenance APIs, most of those limitations can be traced back to the inability to integrate provenance from different sources, layers and granularities into a coherent view of data production. We consider possible solutions in the design of an Improved Provenance API (IPAPI), based on a general model of how different system entities interact to generate, accumulate or propagate provenance. The resulting architecture enables a whole new range of provenance capture scenarios, for which available APIs do not provide adequate support.
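One way to picture the integration problem IPAPI targets: provenance records disclosed at different layers and granularities must reference one another so they compose into a single view of data production. The sketch below uses hypothetical names, not IPAPI's actual interface.

from dataclasses import dataclass, field

@dataclass
class ProvRecord:
    entity: str            # the piece of data described
    layer: str             # e.g. "os", "runtime", "application"
    granularity: str       # e.g. "file", "object", "field"
    derived_from: list = field(default_factory=list)  # cross-layer links

store = []

def disclose(entity, layer, granularity, derived_from=()):
    rec = ProvRecord(entity, layer, granularity, list(derived_from))
    store.append(rec)
    return rec

# An application-level record linking back to an OS-level one:
f = disclose("/tmp/out.csv", "os", "file")
disclose("report.totals", "application", "field", derived_from=[f.entity])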
Research data supporting: "Shadow Kernels: A General Mechanism For Kernel Specialization in Existing Operating Systems"
Research data supporting "Shadow Kernels: A General Mechanism For Kernel Specialization in Existing Operating Systems".
This work was supported by the EPSRC [grant number EP/K503009/1].
Research data supporting "Soroban: Attributing Latency in Virtualized Environments"
This is data required for reproducing the figures in the "Soroban: Attributing Latency in Virtualized Environments" paper. It consists of two archives:

[rscfl-exp.zip] containing:
(i) raw data from kernel-level measurements of lighttpd serving requests while running in a virtualized environment (in JSON format, the .dat and .mdat metadata files);
(ii) post-processed measurements, filtered and loaded into python pandas.DataFrame objects for querying and plotting (in standard binary python pickle serialization format, .sdat files; see the loading sketch after the reproduction command below);
(iii) attribution model files containing serializations of python sklearn.gaussian_process.GaussianProcess objects after training (in standard binary python pickle serialization format, .pickle files);
(iv) python scripts for processing and plotting the data (.py files).

[rscfl-repr.zip] containing:
(i) experiment definition files for figures in the paper, with the information required to reproduce them (.json format);
(ii) a python script for reproducing experiments (rscfl_exp.py).

Additional information: as an example, to reproduce Figure 3 from the paper, unpack the two archives (rscfl-repr and rscfl-exp) and run:
$ ./rscfl_exp.py -c fig3.config.json -s /path/to/rscfl_exp --fg_load_vm localhost
EPSRC EP/K503009/1
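For orientation, the .sdat files described above deserialise to pandas DataFrames; a minimal loading sketch follows (the filename is hypothetical, substitute one from rscfl-exp.zip):

import pickle
import pandas as pd  # needed so the pickled DataFrame can deserialise

with open("fig3.sdat", "rb") as f:
    df = pickle.load(f)

print(df.head())      # inspect the filtered measurements
print(df.describe())  # summary statistics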