An output-sensitive algorithm for the minimization of 2-dimensional String Covers
String covers are a powerful tool for analyzing the quasi-periodicity of
1-dimensional data and find applications in automata theory, computational
biology, coding and the analysis of transactional data. A \emph{cover} of a
string $T$ is a string $C$ for which every letter of $T$ lies within some
occurrence of $C$. String covers have been generalized in many ways, leading to
\emph{k-covers}, \emph{$\lambda$-covers} and \emph{approximate covers}, and have
been studied in different contexts such as \emph{indeterminate strings}.
In this paper we generalize string covers to the context of 2-dimensional
data, such as images. We show how they can be used for the extraction of
textures from images and identification of primitive cells in lattice data.
This has interesting applications in image compression, procedural terrain
generation and crystallography.
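The 1-dimensional definition of a cover given above can be illustrated with a short check. This is only a minimal sketch of the definition, not the output-sensitive 2-dimensional algorithm of the paper; the function names `occurrences`, `is_cover` and `shortest_cover` are hypothetical, and the brute-force search over prefixes is chosen for clarity rather than efficiency.

```python
def occurrences(text, pattern):
    """Return the start indices of all (possibly overlapping) occurrences."""
    starts = []
    i = text.find(pattern)
    while i != -1:
        starts.append(i)
        i = text.find(pattern, i + 1)
    return starts


def is_cover(text, candidate):
    """True if every letter of text lies within some occurrence of candidate."""
    if not candidate or len(candidate) > len(text):
        return False
    covered_up_to = 0  # index of the first letter not yet covered
    for start in occurrences(text, candidate):
        if start > covered_up_to:
            return False  # a letter between two occurrences is left uncovered
        covered_up_to = start + len(candidate)
    return covered_up_to == len(text)


def shortest_cover(text):
    """Brute force: a cover must be a prefix of text, so try prefixes by length."""
    for k in range(1, len(text) + 1):
        if is_cover(text, text[:k]):
            return text[:k]
    return text
```

For example, `is_cover("abaababaaba", "aba")` holds because the occurrences of `aba` at positions 0, 3, 5 and 8 leave no letter uncovered, whereas `aba` is not a cover of `abab`.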
Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark
In Machine Learning, the parent set identification problem is to find a set
of random variables that best explains a selected variable given the data and
some predefined scoring function. This problem is a critical component of
structure learning of Bayesian networks and Markov blanket discovery, and thus has many
practical applications, ranging from fraud detection to clinical decision
support. In this paper, we introduce a new distributed memory approach to the
exact parent sets assignment problem. To achieve scalability, we derive
theoretical bounds to constrain the search space when the MDL scoring function is
used, and we reorganize the underlying dynamic programming such that the
computational density is increased and fine-grain synchronization is
eliminated. We then design an efficient realization of our approach in the Apache
Spark platform. Through experimental results, we demonstrate that the method
maintains strong scalability on a 500-core standalone Spark cluster, and it can
be used to efficiently process data sets with 70 variables, far beyond the
reach of the currently available solutions.
An Efficient Algorithm for Enumerating Chordless Cycles and Chordless Paths
A chordless cycle (induced cycle) of a graph is a cycle without any
chord, meaning that there is no edge outside the cycle connecting two vertices
of the cycle. A chordless path is defined similarly. In this paper, we consider
the problems of enumerating chordless cycles/paths of a given graph $G=(V,E)$
and propose algorithms taking $O(|E|)$ time for each chordless cycle/path. In
the existing literature, these problems had not been studied in depth in
theoretical computer science, and no output-polynomial time algorithm had
been proposed. Our experiments showed that the computation time of our
algorithms is constant per chordless cycle/path for non-dense random graphs and
real-world graphs. They also show that the number of chordless cycles is much
smaller than the number of cycles. We applied the algorithm to prediction of
NMR (Nuclear Magnetic Resonance) spectra, and increased the accuracy of the
prediction.
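The definition of a chordless cycle can be made concrete with a small checker. This is a sketch of the definition only, not the enumeration algorithm proposed in the paper; the function name `is_chordless_cycle` and the adjacency-set representation `adj` are assumptions made for illustration.

```python
def is_chordless_cycle(adj, cycle):
    """Check whether `cycle` is a chordless (induced) cycle.

    adj   -- dict mapping each vertex to the set of its neighbours
             (undirected simple graph; hypothetical representation)
    cycle -- list of distinct vertices in cyclic order
    """
    n = len(cycle)
    if n < 3 or len(set(cycle)) != n:
        return False
    on_cycle = set(cycle)
    for i, u in enumerate(cycle):
        # Within the cycle, u may be adjacent only to its predecessor and
        # successor; any other neighbour on the cycle is a chord, and a
        # missing one means the vertex sequence is not a cycle at all.
        expected = {cycle[i - 1], cycle[(i + 1) % n]}
        if adj[u] & on_cycle != expected:
            return False
    return True
```

On a 4-cycle 0-1-2-3 with the extra edge 0-2, the cycle [0, 1, 2, 3] is rejected because 0-2 is a chord, while the triangle [0, 1, 2] is accepted.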
Large induced subgraphs via triangulations and CMSO
We obtain an algorithmic meta-theorem for the following optimization problem.
Let \phi\ be a Counting Monadic Second Order Logic (CMSO) formula and t be an
integer. For a given graph G, the task is to maximize |X| subject to the
following: there is a set of vertices F of G, containing X, such that the
subgraph G[F] induced by F is of treewidth at most t, and structure (G[F],X)
models \phi.
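Written compactly, the optimization problem stated above reads as follows. This is only a restatement of the preceding sentences; the notation V(G) for the vertex set of G and tw for treewidth is assumed here and does not appear in the abstract.

```latex
\[
  \max\, |X|
  \quad \text{subject to} \quad
  X \subseteq F \subseteq V(G), \qquad
  \operatorname{tw}(G[F]) \le t, \qquad
  (G[F], X) \models \phi .
\]
```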
Some special cases of this optimization problem are the following generic
examples. Each of these cases contains various problems as a special subcase:
1) "Maximum induced subgraph with at most l copies of cycles of length 0
modulo m", where for fixed nonnegative integers m and l, the task is to find a
maximum induced subgraph of a given graph with at most l vertex-disjoint cycles
of length 0 modulo m.
2) "Minimum \Gamma-deletion", where for a fixed finite set of graphs \Gamma\
containing a planar graph, the task is to find a maximum induced subgraph of a
given graph containing no graph from \Gamma\ as a minor.
3) "Independent \Pi-packing", where for a fixed finite set of connected
graphs \Pi, the task is to find an induced subgraph G[F] of a given graph G
with the maximum number of connected components, such that each connected
component of G[F] is isomorphic to some graph from \Pi.
We give an algorithm solving the optimization problem on an n-vertex graph G
in time O(#pmc n^{t+4} f(t,\phi)), where #pmc is the number of all potential
maximal cliques in G and f is a function depending on t and \phi\ only. We also
show how a similar running time can be obtained for the weighted version of the
problem. Pipelined with known bounds on the number of potential maximal
cliques, we deduce that our optimization problem can be solved in time
O(1.7347^n) for arbitrary graphs, and in polynomial time for graph classes with
a polynomial number of minimal separators.
BioDiVinE: A Framework for Parallel Analysis of Biological Models
In this paper a novel tool, BioDiVinE, for parallel analysis of biological
models is presented. The tool allows analysis of biological models specified in
terms of a set of chemical reactions. Chemical reactions are transformed into a
system of multi-affine differential equations. BioDiVinE employs techniques for
finite discrete abstraction of the continuous state space. At that level,
parallel analysis algorithms based on model checking are provided. In the
paper, the key tool features are described and their application is
demonstrated by means of a case study.
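As an illustration of how a chemical reaction can give rise to multi-affine differential equations, consider a single hypothetical reaction A + B -> C with rate constant k. Mass-action kinetics is assumed here; the abstract does not state which kinetics BioDiVinE uses.

```latex
\[
  \frac{d[A]}{dt} = -k\,[A][B], \qquad
  \frac{d[B]}{dt} = -k\,[A][B], \qquad
  \frac{d[C]}{dt} = k\,[A][B].
\]
```

Each right-hand side is affine in every single variable when the other variables are held fixed, which is what multi-affine means.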