Matching Subsequences in Trees
Given two rooted, labeled trees P and T, the tree path subsequence problem
is to determine which paths in P are subsequences of which paths in T. Here
a path begins at the root and ends at a leaf. In this paper we propose this
problem as a useful query primitive for XML data, and provide new algorithms
improving the previously best known time and space bounds. Comment: Minor correction of typos, et
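The problem statement above can be illustrated with a naive baseline. This is only a sketch of the definition, not the algorithm from the paper: it enumerates every root-to-leaf path in both trees and tests each pair directly. The `(label, children)` tuple representation and the names `root_to_leaf_paths`, `is_subsequence`, and `path_subsequences` are assumptions made for this example.

```python
def root_to_leaf_paths(tree):
    """Return every root-to-leaf path of a (label, children) tree as a list of labels."""
    label, children = tree
    if not children:
        return [[label]]
    return [[label] + rest
            for child in children
            for rest in root_to_leaf_paths(child)]


def is_subsequence(p, q):
    """True if p can be obtained from q by deleting elements of q."""
    it = iter(q)
    # Each `x in it` consumes the iterator up to the first match,
    # so matches are forced to appear in left-to-right order.
    return all(x in it for x in p)


def path_subsequences(p_tree, t_tree):
    """Map each path of p_tree to the paths of t_tree that contain it as a subsequence."""
    t_paths = root_to_leaf_paths(t_tree)
    return {
        tuple(p): [tuple(q) for q in t_paths if is_subsequence(p, q)]
        for p in root_to_leaf_paths(p_tree)
    }
```

This brute-force version takes time proportional to the total length of all path pairs, which is exactly the kind of cost the improved bounds in the paper aim to avoid.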
Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression.
Missing data are a commonly occurring threat to the validity and efficiency of epidemiologic studies. Perhaps the most common approach to handling missing data is to simply drop those records with 1 or more missing values, in so-called "complete records" or "complete case" analysis. In this paper, we bring together earlier-derived yet perhaps now somewhat neglected results which show that a logistic regression complete records analysis can provide asymptotically unbiased estimates of the association of an exposure of interest with an outcome, adjusted for a number of confounders, under a surprisingly wide range of missing-data assumptions. We give detailed guidance describing how the observed data can be used to judge the plausibility of these assumptions. The results mean that in large epidemiologic studies which are affected by missing data and analyzed by logistic regression, exposure associations may be estimated without bias in a number of settings where researchers might otherwise assume that bias would occur
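One classical setting in which a complete records analysis leaves the exposure odds ratio unchanged can be checked with simple arithmetic: when the probability that a record is complete depends only on the outcome. The sketch below uses a hypothetical 2x2 table and hypothetical retention probabilities; it illustrates the general idea only and is not taken from the paper, whose assumptions may be broader.

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Exposure odds ratio for a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    return (Fraction(a) * Fraction(d)) / (Fraction(b) * Fraction(c))


# Hypothetical expected counts in the full data.
a, b, c, d = 40, 60, 20, 80

# Missingness depends on the outcome only: cases are complete with
# probability 1/2, non-cases with probability 9/10, whatever the exposure.
keep_case, keep_noncase = Fraction(1, 2), Fraction(9, 10)
outcome_only = (a * keep_case, b * keep_noncase, c * keep_case, d * keep_noncase)

# Missingness depends on exposure and outcome jointly:
# only exposed cases lose half of their records.
joint = (a * Fraction(1, 2), b, c, d)

full_or = odds_ratio(a, b, c, d)
outcome_only_or = odds_ratio(*outcome_only)
joint_or = odds_ratio(*joint)
```

In the outcome-only scenario the retention probabilities cancel in the ratio, so `outcome_only_or` equals `full_or`. In the joint scenario they do not cancel, and the complete records odds ratio differs from the full-data value.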
Molecular Model of Dynamic Social Network Based on E-mail communication
In this work we consider an application of a physically inspired sociodynamical model to modelling the evolution of an email-based social network. Contrary to the standard approach of sociodynamics, which expresses system dynamics through heuristically defined simple rules, we propose inferring these rules from real data and applying them within a dynamic molecular model. We show how to embed the n-dimensional social space in a Euclidean one. Then, inspired by the Lennard-Jones potential, we define a data-driven social potential function and apply the resultant force to a real e-mail communication network in the course of a molecular simulation, with network nodes taking on the role of interacting particles. We discuss all steps of the modelling process, from data preparation, through embedding and the molecular simulation itself, to transformation from the embedding space back to a graph structure. The conclusions, drawn from examining the resultant networks in stable, minimum-energy states, emphasize the role of the embedding process projecting the non-metric social graph into the Euclidean space, the significance of the unavoidable loss of information connected with this procedure, and the resultant preservation of global rather than local properties of the initial network. We also argue for the applicability of our method to some classes of problems, while signalling the areas which require further research in order to expand this applicability domain
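For readers unfamiliar with the Lennard-Jones potential that inspires the model, the classical form can be sketched as follows. This is the textbook 12-6 potential, not the data-driven social potential defined in the paper, which the abstract does not specify; the function names and the default parameters `epsilon` and `sigma` are assumptions for this example.

```python
def lj_potential(r, epsilon=1.0, sigma=1.0):
    """Classical Lennard-Jones potential V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force F(r) = -dV/dr; positive values push the two particles apart."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


# Equilibrium distance where the force vanishes and the potential reaches -epsilon.
r_min = 2.0 ** (1.0 / 6.0)
```

Particles closer than `r_min` repel and particles farther apart attract, which is why a simulation driven by such a force can settle into the stable, minimum-energy states mentioned in the abstract.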
Compact Drawings of 1-Planar Graphs with Right-Angle Crossings and Few Bends
We study the following classes of beyond-planar graphs: 1-planar, IC-planar,
and NIC-planar graphs. These are the graphs that admit a 1-planar, IC-planar,
and NIC-planar drawing, respectively. A drawing of a graph is 1-planar if every
edge is crossed at most once. A 1-planar drawing is IC-planar if no two pairs
of crossing edges share a vertex. A 1-planar drawing is NIC-planar if no two
pairs of crossing edges share two vertices. We study the relations of these
beyond-planar graph classes (beyond-planar graphs is a collective term for the
primary attempts to generalize the planar graphs) to right-angle crossing (RAC)
graphs that admit compact drawings on the grid with few bends. We present four
drawing algorithms that preserve the given embeddings. First, we show that
every n-vertex NIC-planar graph admits a NIC-planar RAC drawing with at most
one bend per edge on a grid of size . Then, we show that
every n-vertex 1-planar graph admits a 1-planar RAC drawing with at most two
bends per edge on a grid of size . Finally, we make two
known algorithms embedding-preserving: one for drawing 1-planar RAC graphs with
at most one bend per edge and one for drawing IC-planar RAC graphs straight-line
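The three drawing classes defined above differ only in how pairs of crossing edges may overlap, which can be checked directly once the crossings of a drawing are known. The sketch below assumes the crossings are already given as pairs of edges (it does not compute them from coordinates); the function name `classify_drawing` and the input format are assumptions for this example.

```python
from collections import Counter
from itertools import combinations


def classify_drawing(crossings):
    """Classify a drawing from its list of crossings.

    Each crossing is a pair of edges, and each edge is a pair of vertices,
    e.g. ((1, 2), (3, 4)) means edge 1-2 crosses edge 3-4.
    Returns the most restrictive class the drawing satisfies.
    """
    # 1-planar: every edge takes part in at most one crossing.
    edge_uses = Counter(frozenset(e) for pair in crossings for e in pair)
    if any(count > 1 for count in edge_uses.values()):
        return "not 1-planar"

    # Endpoints involved in each crossing.
    vertex_sets = [set(e1) | set(e2) for e1, e2 in crossings]
    most_shared = max(
        (len(a & b) for a, b in combinations(vertex_sets, 2)), default=0
    )

    if most_shared == 0:
        return "IC-planar"   # no two crossing pairs share a vertex
    if most_shared == 1:
        return "NIC-planar"  # no two crossing pairs share two vertices
    return "1-planar"
```

A drawing with no crossings at all is reported as IC-planar, since the IC condition holds vacuously.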
Fast Arc-Annotated Subsequence Matching in Linear Space
An arc-annotated string is a string of characters, called bases, augmented
with a set of pairs, called arcs, each connecting two bases. Given
arc-annotated strings P and Q, the arc-preserving subsequence problem is to
determine if P can be obtained from Q by deleting bases from Q. Whenever
a base is deleted any arc with an endpoint in that base is also deleted.
Arc-annotated strings where the arcs are "nested" are a natural model of RNA
molecules that captures both their primary and secondary structure. The
arc-preserving subsequence problem for nested arc-annotated strings is a basic
primitive for investigating the function of RNA molecules. Gramm et al. [ACM
Trans. Algorithms 2006] gave an algorithm for this problem using O(nm) time
and space, where n and m are the lengths of P and Q, respectively. In
this paper we present a new algorithm using O(nm) time and O(n + m) space,
thereby matching the previous time bound while significantly reducing the space
from a quadratic term to linear. This is essential to process large RNA
molecules where the space is likely to be a bottleneck. To obtain our result we
introduce several novel ideas which may be of independent interest for related
problems on arc-annotated strings. Comment: To appear in Algoritmic
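The definition of the arc-preserving subsequence problem can be made concrete with a brute-force checker. This is an exponential-time illustration of the definition only, not the algorithm from the paper or from Gramm et al.; the names `pattern`, `text`, and `is_arc_preserving_subsequence`, and the representation of arcs as pairs of 0-based positions, are assumptions for this example.

```python
from itertools import combinations


def is_arc_preserving_subsequence(pattern, pattern_arcs, text, text_arcs):
    """Check whether (pattern, pattern_arcs) arises from (text, text_arcs) by deleting bases.

    Deleting a base also deletes every arc with an endpoint at that base,
    so the arcs that survive must be exactly the arcs of the pattern.
    """
    text_arcs = {tuple(sorted(arc)) for arc in text_arcs}
    pattern_arcs = {tuple(sorted(arc)) for arc in pattern_arcs}

    # Try every choice of positions of `text` that are kept.
    for kept in combinations(range(len(text)), len(pattern)):
        if any(text[pos] != ch for pos, ch in zip(kept, pattern)):
            continue
        # Renumber kept positions to positions in the pattern.
        new_index = {old: new for new, old in enumerate(kept)}
        surviving = {
            (new_index[i], new_index[j])
            for i, j in text_arcs
            if i in new_index and j in new_index
        }
        if surviving == pattern_arcs:
            return True
    return False
```

For example, with `text = "GACUC"` and a single arc between positions 0 and 4, the pattern `"GC"` with an arc between its two bases is found by keeping positions 0 and 4, while the pattern `"AU"` with an arc is not, because no arc in the text connects an A to a U.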
Effective representation of RT-LOTOS terms by finite time petri nets
The paper describes a transformational approach for the specification and formal verification of concurrent and real-time systems. At the upper level, a system is specified using the timed process algebra RT-LOTOS. The output of the proposed transformation is a Time Petri net (TPN). The paper particularly shows how a TPN can be automatically constructed from an RT-LOTOS specification using a compositionally defined mapping. The proof of the translation consistency is sketched in the paper and developed in [1]. The RT-LOTOS to TPN translation patterns formalized in the paper are being implemented in a prototype tool. This enables reusing TPN verification techniques and tools for the benefit of RT-LOTOS
Exploiting Behavioral Hierarchy for Efficient Model Checking
Inspired by the success of model checking in hardware and protocol verification, model checking techniques for software have been the focus of a lot of research in the last few years [5,3,2,6]. Model checking can be applied only to relatively small models due to its inherently high computational requirements, and there are two complementary trends to address scalability. The model extraction approach, exemplified by projects such as Bandera [6] and SLAM [3], involves constructing inputs to model checkers by abstracting programs written in languages such as C and Java. The model-based design approach, exemplified by modeling notations such as Statecharts [7], promotes design using high-level models that are compiled into code. Our research agenda is to develop model checking techniques for model-based design of software.
Modern software design languages promote hierarchy as one of the key constructs for structuring complex specifications. The input language to our model checker is based on hierarchic reactive modules [1]. This choice was motivated by the fact that, unlike Statecharts and other languages, hierarchic reactive modules treat hierarchy semantically, with an observational trace-based semantics and a notion of refinement with assume-guarantee rules. The first contribution of this paper is the Hermes toolkit, which implements hierarchic reactive modules. Our implementation has a visual front-end and an XML-based back-end, consistent with modern software design tools, and is written in Java.
There are two basic techniques for reachability analysis. Enumerative model checkers such as SPIN [8] perform an on-the-fly exploration of the state space using a depth-first search, while symbolic model checkers such as SMV [9] perform a breadth-first search by manipulating sets of states, rather than individual states, typically encoded by ordered binary (or multi-valued) decision diagrams. Since the two approaches are incomparable and both have been shown to be successful, Hermes supports both enumerative and symbolic reachability analysis. In this paper, we report progress on exploiting the structuring information in the behavioral hierarchy of the input model to speed up the exploration of the reachable state space of the model for both approaches. More information about the tool is available at http://www.cis.upenn.edu/sdrl/hermes
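The contrast between the two reachability techniques can be sketched on an explicit transition relation. This is a toy illustration, not how Hermes, SPIN, or SMV are implemented: in particular, plain Python sets stand in here for the decision diagrams a symbolic checker would use, and the function names and the `successors` callback are assumptions for this example.

```python
def reachable_enumerative(initial, successors):
    """Depth-first exploration that visits one state at a time."""
    seen = set()
    stack = [initial]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        stack.extend(successors(state))
    return seen


def reachable_set_based(initial, successors):
    """Breadth-first exploration that advances a whole frontier set per step."""
    reached = {initial}
    frontier = {initial}
    while frontier:
        # Image of the current frontier under the transition relation.
        image = {t for s in frontier for t in successors(s)}
        frontier = image - reached
        reached |= frontier
    return reached
```

Both functions compute the same set of reachable states; they differ in the unit of work (a single state versus a set of states), which is the distinction the abstract draws between enumerative and symbolic search.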
String Indexing for Patterns with Wildcards
We consider the problem of indexing a string of length to report the
occurrences of a query pattern containing characters and wildcards.
Let be the number of occurrences of in , and the size of
the alphabet. We obtain the following results.
- A linear space index with query time .
This significantly improves the previously best known linear space index by Lam
et al. [ISAAC 2007], which requires query time in the worst case.
- An index with query time using space , where is the maximum number of wildcards allowed in the pattern.
This is the first non-trivial bound with this query time.
- A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].
We also show that these indexes can be generalized to allow variable length
gaps in the pattern. Our results are obtained using a novel combination of
well-known and new techniques, which could be of independent interest
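As a point of comparison for the indexes described above, the query they answer can be stated as a naive scan with no index at all. This sketch only pins down what an occurrence of a pattern with wildcards means; it is not one of the paper's data structures, and the function name and the choice of `*` as the wildcard symbol are assumptions for this example.

```python
def wildcard_occurrences(text, pattern, wildcard="*"):
    """Return every start position where pattern matches text.

    A wildcard position in the pattern matches any single character.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == wildcard or p == text[i + k] for k, p in enumerate(pattern))
    ]
```

The scan inspects every alignment of the pattern against the text, so its cost grows with the product of the two lengths; an index is built once over the text precisely to avoid repeating that work for each query.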
Revisiting Semantics of Interactions for Trace Validity Analysis
Interaction languages such as MSC are often associated with formal semantics
by means of translations into distinct behavioral formalisms such as automata
or Petri nets. In contrast to translational approaches we propose an
operational approach. Its principle is to identify which elementary
communication actions can be immediately executed, and then to compute, for
every such action, a new interaction representing the possible continuations to
its execution. We also define an algorithm for checking the validity of
execution traces (i.e. whether or not they belong to an interaction's
semantics). Algorithms for semantic computation and trace validity are analyzed
by means of experiments. Comment: 18 pages of contents and 2 pages for references, 10 figures.
Published in ETAPS-FASE2020 : "23rd International Conference on Fundamental
Approaches to Software Engineering" in the "research papers" categor
A Distributed Multilevel Force-directed Algorithm
The wide availability of powerful and inexpensive cloud computing services
naturally motivates the study of distributed graph layout algorithms, able to
scale to very large graphs. Nowadays, to process Big Data, companies are
increasingly relying on PaaS infrastructures rather than buying and maintaining
complex and expensive hardware. So far, only a few examples of basic
force-directed algorithms that work in a distributed environment have been
described. Instead, the design of a distributed multilevel force-directed
algorithm is a much more challenging task, not yet addressed. We present the
first multilevel force-directed algorithm based on a distributed vertex-centric
paradigm, and its implementation on Giraph, a popular platform for distributed
graph algorithms. Experiments show the effectiveness and the scalability of the
approach. Using an inexpensive cloud computing service of Amazon, we draw
graphs with ten million edges in about 60 minutes. Comment: Appears in the Proceedings of the 24th International Symposium on
Graph Drawing and Network Visualization (GD 2016