Provenance-based computing
Relying on increasingly complex computing systems is difficult: with many
factors potentially affecting the result of a computation or its properties,
understanding where problems appear and how to fix them is a challenging
proposition. Typically, the process of finding solutions is driven by trial
and error or by experience-based insights.
In this dissertation, I examine the idea of using provenance metadata (the set
of elements that have contributed to the existence of a piece of data, together
with their relationships) instead. I show that considering provenance a
primitive of computation enables the exploration of system behaviour, targeting
both retrospective analysis (root cause analysis, performance tuning) and
hypothetical scenarios (what-if questions). In this context, provenance can be
used as part of feedback loops with a dual purpose: building software that is
able to adapt to meet given quality and performance targets (semi-automated
tuning), and enabling human operators to exert high-level runtime control with
limited prior knowledge of a system's internal architecture.
My contributions towards this goal are threefold: providing low-level
mechanisms for meaningful provenance collection that account for OS-level
resource multiplexing; demonstrating that such provenance data supports
inferences about application behaviour; and generalising this to a set of
primitives necessary for fine-grained provenance disclosure in a wider context.
To derive such primitives in a bottom-up manner, I first present Resourceful, a
framework that enables capturing OS-level measurements in the context of
application activities. It is the contextualisation that allows tying the
measurements to provenance in a meaningful way, and I examine a number of
use cases for understanding application performance. This also provides a good
setup for evaluating the impact and overheads of fine-grained provenance
collection.
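As an illustration of the contextualisation idea, the sketch below ties coarse OS-level measurements (user and system CPU time from getrusage, Unix only) to a named application activity. The interface and names are hypothetical, not Resourceful's actual API, which captures far finer-grained, per-subsystem kernel measurements.

import resource
from contextlib import contextmanager

measurements = []

@contextmanager
def activity(name):
    # Hypothetical stand-in for Resourceful: capture OS-level resource
    # usage deltas in the context of a named application activity.
    before = resource.getrusage(resource.RUSAGE_SELF)
    try:
        yield
    finally:
        after = resource.getrusage(resource.RUSAGE_SELF)
        measurements.append({
            "activity": name,
            "user_s": after.ru_utime - before.ru_utime,
            "sys_s": after.ru_stime - before.ru_stime,
        })

with activity("serve_request"):
    sum(i * i for i in range(10**6))  # placeholder application work
print(measurements)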
I then show that the collected data enables new ways of understanding
performance variation by attributing it to specific components within a
system. The resulting set of tools, Soroban, gives developers and operations
engineers a principled way of examining the impact of various configuration, OS and virtualization parameters on application behaviour.
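To give a flavour of component-level attribution, here is a minimal sketch using Gaussian process regression, the model family serialised in Soroban's released data (as the older sklearn GaussianProcess class; the sketch uses its modern replacement, GaussianProcessRegressor). The features and data below are synthetic, purely for illustration.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
# Synthetic stand-ins for per-request measurements, e.g. time spent
# in the hypervisor and in the scheduler (columns), against latency.
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, 200)

gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
mean, std = gp.predict([[0.8, 0.2]], return_std=True)
print(f"predicted latency: {mean[0]:.2f} +/- {std[0]:.2f}")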
Finally, I consider how this supports the idea that provenance should be
disclosed at application level, and discuss why such disclosure is necessary
for using the collected metadata efficiently, at a granularity that is
meaningful in relation to application semantics.
CHESS Scholarship Scheme; EPSRC
ProvMark: A Provenance Expressiveness Benchmarking System
System level provenance is of widespread interest for applications such as
security enforcement and information protection. However, testing the
correctness or completeness of provenance capture tools is challenging and
currently done manually. In some cases there is not even a clear consensus
about what behavior is correct. We present an automated tool, ProvMark, that
uses an existing provenance system as a black box and reliably identifies the
provenance graph structure recorded for a given activity, by a reduction to
subgraph isomorphism problems handled by an external solver. ProvMark is a
first step in the much-needed area of testing and comparing the expressiveness
of provenance systems. We demonstrate ProvMark's usefulness in comparing three
capture systems with different architectures and distinct design philosophies.
To appear in Middleware 2019.
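The core reduction can be illustrated with a toy example: compare a "background" recording (capture tool running, no target activity) against a "foreground" recording that includes the target activity, and test embedding via subgraph isomorphism. The node and edge labels below are illustrative, not ProvMark's vocabulary, and networkx's VF2 matcher stands in for the external solver.

import networkx as nx
from networkx.algorithms import isomorphism

background = nx.DiGraph()
background.add_edge("shell_proc", "log_file", label="wasGeneratedBy")

foreground = background.copy()
foreground.add_edge("shell_proc", "tmp_file", label="wasGeneratedBy")

# Does the background recording embed in the foreground one?
gm = isomorphism.DiGraphMatcher(foreground, background)
print("background embeds:", gm.subgraph_is_isomorphic())

# The extra structure approximates the provenance recorded for the
# target activity itself.
extra = set(foreground.edges()) - set(background.edges())
print("activity-specific edges:", extra)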
Improving the Visualization of Electron-Microscopy Data Through Optical Flow Interpolation
Technical developments in neurobiology have reached a point where the acquisition of high-resolution images representing individual neurons and synapses becomes possible. For this, brain tissue samples are sliced using a diamond knife and imaged with electron microscopy (EM). However, the technique achieves a low resolution in the cutting direction, due to limitations of the mechanical process, making a direct visualization of a dataset difficult. We aim to increase the depth resolution of the volume by adding new image slices interpolated from the existing ones, without requiring modifications to the EM image-capturing method. As classical interpolation methods do not provide satisfactory results on this type of data, this paper reframes the problem in terms of motion volumes, considering the depth axis as a temporal axis. An optical flow method is adapted to estimate the motion vectors of pixels in the EM images, and this information is used to compute and insert multiple new images at certain depths in the volume. We evaluate the visualization results against interpolation methods currently used on EM data, transforming the highly anisotropic original dataset into one with a higher depth resolution. The interpolation based on optical flow better reveals neurite structures with realistic, undistorted shapes, and makes neuronal connections easier to map.
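A minimal sketch of the approach, using OpenCV's Farneback flow as a stand-in for the paper's adapted optical flow method: estimate motion between adjacent slices, backward-warp each endpoint halfway along the flow, and blend the warps to synthesise the intermediate slice.

import numpy as np
import cv2  # opencv-python

def interpolate_midslice(slice_a, slice_b):
    # Treat the depth axis as time and estimate dense motion a -> b.
    flow = cv2.calcOpticalFlowFarneback(
        slice_a, slice_b, None, 0.5, 4, 15, 3, 5, 1.1, 0)
    h, w = slice_a.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    # Warp each endpoint halfway along the motion and blend: a simple
    # approximation of motion-compensated interpolation.
    warp_a = cv2.remap(slice_a, gx - 0.5 * flow[..., 0],
                       gy - 0.5 * flow[..., 1], cv2.INTER_LINEAR)
    warp_b = cv2.remap(slice_b, gx + 0.5 * flow[..., 0],
                       gy + 0.5 * flow[..., 1], cv2.INTER_LINEAR)
    mid = (warp_a.astype(np.float32) + warp_b.astype(np.float32)) / 2.0
    return mid.astype(np.uint8)

# Synthetic example: a bright blob shifting six pixels between slices.
a = np.zeros((128, 128), np.uint8); cv2.circle(a, (40, 64), 10, 255, -1)
b = np.zeros((128, 128), np.uint8); cv2.circle(b, (46, 64), 10, 255, -1)
mid = interpolate_midslice(a, b)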
IPAPI: Designing an Improved Provenance API
We investigate the main limitations imposed by existing provenance systems in the development of provenance-aware applications. In the case of disclosed provenance APIs, most of those limitations can be traced back to the inability to integrate provenance from different sources, layers and granularities into a coherent view of data production. We consider possible solutions in the design of an Improved Provenance API (IPAPI), based on a general model of how different system entities interact to generate, accumulate or propagate provenance. The resulting architecture enables a whole new range of provenance capture scenarios, for which available APIs do not provide adequate support.
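One way to picture the integration problem IPAPI targets: provenance records disclosed at different layers and granularities must reference one another so they compose into a single view of data production. The sketch below uses hypothetical names, not IPAPI's actual interface.

from dataclasses import dataclass, field

@dataclass
class ProvRecord:
    entity: str            # the piece of data described
    layer: str             # e.g. "os", "runtime", "application"
    granularity: str       # e.g. "file", "object", "field"
    derived_from: list = field(default_factory=list)  # cross-layer links

store = []

def disclose(entity, layer, granularity, derived_from=()):
    rec = ProvRecord(entity, layer, granularity, list(derived_from))
    store.append(rec)
    return rec

# An application-level record linking back to an OS-level one:
f = disclose("/tmp/out.csv", "os", "file")
disclose("report.totals", "application", "field", derived_from=[f.entity])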
Research data supporting: "Shadow Kernels: A General Mechanism For Kernel Specialization in Existing Operating Systems"
Research data supporting "Shadow Kernels: A General Mechanism For Kernel Specialization in Existing Operating Systems".
This work was supported by the EPSRC [grant number EP/K503009/1].
Research data supporting "Soroban: Attributing Latency in Virtualized Environments"
This is data required for reproducing the figures in the "Soroban: Attributing Latency in Virtualized Environments" paper. It consists of two archives:

[rscfl-exp.zip] containing:
(i) raw data from kernel-level measurements of lighttpd serving requests while running in a virtualized environment (in JSON format, the .dat and .mdat metadata files);
(ii) post-processed measurements, filtered and loaded into python pandas.DataFrame objects for querying and plotting (in standard binary python pickle serialization format, .sdat files; see the loading sketch after the reproduction command below);
(iii) attribution model files containing serializations of python sklearn.gaussian_process.GaussianProcess objects after training (in standard binary python pickle serialization format, .pickle files);
(iv) python scripts for processing and plotting the data (.py files).

[rscfl-repr.zip] containing:
(i) experiment definition files for figures in the paper, with the information required to reproduce them (.json format);
(ii) a python script for reproducing experiments (rscfl_exp.py).

Additional information: as an example, to reproduce Figure 3 from the paper, unpack the two archives (rscfl-repr and rscfl-exp) and run:
$ ./rscfl_exp.py -c fig3.config.json -s /path/to/rscfl_exp --fg_load_vm localhost
EPSRC EP/K503009/1
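For orientation, the .sdat files described above deserialise to pandas DataFrames; a minimal loading sketch follows (the filename is hypothetical, substitute one from rscfl-exp.zip):

import pickle
import pandas as pd  # needed so the pickled DataFrame can deserialise

with open("fig3.sdat", "rb") as f:
    df = pickle.load(f)

print(df.head())      # inspect the filtered measurements
print(df.describe())  # summary statistics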