14 research outputs found
Understanding Legacy Workflows through Runtime Trace Analysis
abstract: When scientific software is written to specify processes, it takes the form of a workflow, and is often written in an ad-hoc manner in a dynamic programming language. There is a proliferation of legacy workflows implemented by non-expert programmers due to the accessibility of dynamic languages. Unfortunately, ad-hoc workflows lack a structured description as provided by specialized management systems, making ad-hoc workflow maintenance and reuse difficult, and motivating the need for analysis methods. The analysis of ad-hoc workflows using compiler techniques does not address dynamic languages - a program has so few constrains that its behavior cannot be predicted. In contrast, workflow provenance tracking has had success using run-time techniques to record data. The aim of this work is to develop a new analysis method for extracting workflow structure at run-time, thus avoiding issues with dynamics.
The method captures the dataflow of an ad-hoc workflow through its execution and abstracts it with a process for simplifying repetition. An instrumentation system first processes the workflow to produce an instrumented version, capable of logging events, which is then executed on an input to produce a trace. The trace undergoes dataflow construction to produce a provenance graph. The dataflow is examined for equivalent regions, which are collected into a single unit. The workflow is thus characterized in terms of its treatment of an input. Unlike other methods, a run-time approach characterizes the workflow's actual behavior; including elements which static analysis cannot predict (for example, code dynamically evaluated based on input parameters). This also enables the characterization of dataflow through external tools.
The contributions of this work are: a run-time method for recording a provenance graph from an ad-hoc Python workflow, and a method to analyze the structure of a workflow from provenance. Methods are implemented in Python and are demonstrated on real world Python workflows. These contributions enable users to derive graph structure from workflows. Empowered by a graphical view, users can better understand a legacy workflow. This makes the wealth of legacy ad-hoc workflows accessible, enabling workflow reuse instead of investing time and resources into creating a workflow.Dissertation/ThesisMasters Thesis Computer Science 201
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability
Proceedings of the 4th International Conference on Principles and Practices of Programming in Java
This book contains the proceedings of the 4th international conference on principles and practices of programming in Java. The conference focuses on the different aspects of the Java programming language and its applications
Performance of Computer Systems; Proceedings of the 4th International Symposium on Modelling and Performance Evaluation of Computer Systems, Vienna, Austria, February 6-8, 1979
These proceedings are a collection of contributions to computer system performance, selected by the usual refereeing process from papers submitted to the symposium, as well as a few invited papers representing significant novel contributions made during the last year. They represent the thrust and vitality of the subject as well as its capacity to identify important basic problems and major application areas. The main methodological problems appear in the underlying queueing theoretic aspects, in the deterministic analysis of waiting time phenomena, in workload characterization and representation, in the algorithmic aspects of model processing, and in the analysis of measurement data. Major areas for applications are computer architectures, data bases, computer networks, and capacity planning.
The international importance of the area of computer system performance was well reflected at the symposium by participants from 19 countries. The mixture of participants was also evident in the institutions which they represented: 35% from universities, 25% from governmental research organizations, but also 30% from industry and 10% from non-research government bodies. This proves that the area is reaching a stage of maturity where it can contribute directly to progress in practical problems