10,921 research outputs found
Model Checking Parse Trees
Parse trees are fundamental syntactic structures in both computational
linguistics and compilers construction. We argue in this paper that, in both
fields, there are good incentives for model-checking sets of parse trees for
some word according to a context-free grammar. We put forward the adequacy of
propositional dynamic logic (PDL) on trees in these applications, and study as
a sanity check the complexity of the corresponding model-checking problem:
although complete for exponential time in the general case, we find natural
restrictions on grammars for our applications and establish complexities
ranging from nondeterministic polynomial time to polynomial space in the
relevant cases.Comment: 21 + x page
Search and Result Presentation in Scientific Workflow Repositories
We study the problem of searching a repository of complex hierarchical
workflows whose component modules, both composite and atomic, have been
annotated with keywords. Since keyword search does not use the graph structure
of a workflow, we develop a model of workflows using context-free bag grammars.
We then give efficient polynomial-time algorithms that, given a workflow and a
keyword query, determine whether some execution of the workflow matches the
query. Based on these algorithms we develop a search and ranking solution that
efficiently retrieves the top-k grammars from a repository. Finally, we propose
a novel result presentation method for grammars matching a keyword query, based
on representative parse-trees. The effectiveness of our approach is validated
through an extensive experimental evaluation
Answering Regular Path Queries on Workflow Provenance
This paper proposes a novel approach for efficiently evaluating regular path
queries over provenance graphs of workflows that may include recursion. The
approach assumes that an execution g of a workflow G is labeled with
query-agnostic reachability labels using an existing technique. At query time,
given g, G and a regular path query R, the approach decomposes R into a set of
subqueries R1, ..., Rk that are safe for G. For each safe subquery Ri, G is
rewritten so that, using the reachability labels of nodes in g, whether or not
there is a path which matches Ri between two nodes can be decided in constant
time. The results of each safe subquery are then composed, possibly with some
small unsafe remainder, to produce an answer to R. The approach results in an
algorithm that significantly reduces the number of subqueries k over existing
techniques by increasing their size and complexity, and that evaluates each
subquery in time bounded by its input and output size. Experimental results
demonstrate the benefit of this approach
SAGA: A project to automate the management of software production systems
The SAGA system is a software environment that is designed to support most of the software development activities that occur in a software lifecycle. The system can be configured to support specific software development applications using given programming languages, tools, and methodologies. Meta-tools are provided to ease configuration. The SAGA system consists of a small number of software components that are adapted by the meta-tools into specific tools for use in the software development application. The modules are design so that the meta-tools can construct an environment which is both integrated and flexible. The SAGA project is documented in several papers which are presented
Identifiability and Unmixing of Latent Parse Trees
This paper explores unsupervised learning of parsing models along two
directions. First, which models are identifiable from infinite data? We use a
general technique for numerically checking identifiability based on the rank of
a Jacobian matrix, and apply it to several standard constituency and dependency
parsing models. Second, for identifiable models, how do we estimate the
parameters efficiently? EM suffers from local optima, while recent work using
spectral methods cannot be directly applied since the topology of the parse
tree varies across sentences. We develop a strategy, unmixing, which deals with
this additional complexity for restricted classes of parsing models
- …