6,598 research outputs found
Answering Regular Path Queries on Workflow Provenance
This paper proposes a novel approach for efficiently evaluating regular path
queries over provenance graphs of workflows that may include recursion. The
approach assumes that an execution g of a workflow G is labeled with
query-agnostic reachability labels using an existing technique. At query time,
given g, G and a regular path query R, the approach decomposes R into a set of
subqueries R1, ..., Rk that are safe for G. For each safe subquery Ri, G is
rewritten so that, using the reachability labels of nodes in g, whether or not
there is a path which matches Ri between two nodes can be decided in constant
time. The results of each safe subquery are then composed, possibly with some
small unsafe remainder, to produce an answer to R. The approach results in an
algorithm that significantly reduces the number of subqueries k over existing
techniques by increasing their size and complexity, and that evaluates each
subquery in time bounded by its input and output size. Experimental results
demonstrate the benefit of this approach
Labeling Workflow Views with Fine-Grained Dependencies
This paper considers the problem of efficiently answering reachability
queries over views of provenance graphs, derived from executions of workflows
that may include recursion. Such views include composite modules and model
fine-grained dependencies between module inputs and outputs. A novel
view-adaptive dynamic labeling scheme is developed for efficient query
evaluation, in which view specifications are labeled statically (i.e. as they
are created) and data items are labeled dynamically as they are produced during
a workflow execution. Although the combination of fine-grained dependencies and
recursive workflows entail, in general, long (linear-size) data labels, we show
that for a large natural class of workflows and views, labels are compact
(logarithmic-size) and reachability queries can be evaluated in constant time.
Experimental results demonstrate the benefit of this approach over the
state-of-the-art technique when applied for labeling multiple views.Comment: VLDB201
Coming: a Tool for Mining Change Pattern Instances from Git Commits
Software repositories such as Git have become a relevant source of
information for software engineer researcher. For instance, the detection of
Commits that fulfill a given criterion (e.g., bugfixing commits) is one of the
most frequent tasks done to understand the software evolution. However, to our
knowledge, there is not open-source tools that, given a Git repository, returns
all the instances of a given change pattern. In this paper we present Coming, a
tool that takes an input a Git repository and mines instances of change
patterns on each commit. For that, Coming computes fine-grained changes between
two consecutive revisions, analyzes those changes to detect if they correspond
to an instance of a change pattern (specified by the user using XML), and
finally, after analyzing all the commits, it presents a) the frequency of code
changes and b) the instances found on each commit. We evaluate Coming on a set
of 28 pairs of revisions from Defects4J, finding instances of change patterns
that involve If conditions on 26 of them
Extracting Build Changes with BUILDDIFF
Build systems are an essential part of modern software engineering projects.
As software projects change continuously, it is crucial to understand how the
build system changes because neglecting its maintenance can lead to expensive
build breakage. Recent studies have investigated the (co-)evolution of build
configurations and reasons for build breakage, but they did this only on a
coarse grained level. In this paper, we present BUILDDIFF, an approach to
extract detailed build changes from MAVEN build files and classify them into 95
change types. In a manual evaluation of 400 build changing commits, we show
that BUILDDIFF can extract and classify build changes with an average precision
and recall of 0.96 and 0.98, respectively. We then present two studies using
the build changes extracted from 30 open source Java projects to study the
frequency and time of build changes. The results show that the top 10 most
frequent change types account for 73% of the build changes. Among them, changes
to version numbers and changes to dependencies of the projects occur most
frequently. Furthermore, our results show that build changes occur frequently
around releases. With these results, we provide the basis for further research,
such as for analyzing the (co-)evolution of build files with other artifacts or
improving effort estimation approaches. Furthermore, our detailed change
information enables improvements of refactoring approaches for build
configurations and improvements of models to identify error-prone build files.Comment: Accepted at the International Conference of Mining Software
Repositories (MSR), 201
Queensland University of Technology at TREC 2005
The Information Retrieval and Web Intelligence (IR-WI) research group is a research team at the Faculty of Information Technology, QUT, Brisbane, Australia. The IR-WI group participated in the Terabyte and Robust track at TREC 2005, both for the first time. For the Robust track we applied our existing information retrieval system that was originally designed for use with structured (XML) retrieval to the domain of document retrieval. For the Terabyte track we experimented with an open source IR system, Zettair and performed two types of experiments. First, we compared Zettair’s performance on both a high-powered supercomputer and a distributed system across seven midrange personal computers. Second, we compared Zettair’s performance when a standard TREC title is used, compared with a natural language query, and a query expanded with synonyms. We compare the systems both in terms of efficiency and retrieval performance. Our results indicate that the distributed system is faster than the supercomputer, while slightly decreasing retrieval performance, and that natural language queries also slightly decrease retrieval performance, while our query expansion technique significantly decreased performance
Context-driven progressive enhancement of mobile web applications: a multicriteria decision-making approach
Personal computing has become all about mobile and embedded devices. As a result, the adoption rate of smartphones is rapidly increasing and this trend has set a need for mobile applications to be available at anytime, anywhere and on any device. Despite the obvious advantages of such immersive mobile applications, software developers are increasingly facing the challenges related to device fragmentation. Current application development solutions are insufficiently prepared for handling the enormous variety of software platforms and hardware characteristics covering the mobile eco-system. As a result, maintaining a viable balance between development costs and market coverage has turned out to be a challenging issue when developing mobile applications. This article proposes a context-aware software platform for the development and delivery of self-adaptive mobile applications over the Web. An adaptive application composition approach is introduced, capable of autonomously bypassing context-related fragmentation issues. This goal is achieved by incorporating and validating the concept of fine-grained progressive application enhancements based on a multicriteria decision-making strategy
Corpora and evaluation tools for multilingual named entity grammar development
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats
Some issues in the 'archaeology' of software evolution
During a software project's lifetime, the software goes through many changes, as components are added, removed and modified to fix bugs and add new features. This paper is intended as a lightweight introduction to some of the issues arising from an `archaeological' investigation of software evolution. We use our own work to look at some of the challenges faced, techniques used, findings obtained, and lessons learnt when measuring and visualising the historical changes that happen during the evolution of software
- …