20,056 research outputs found

An output-sensitive algorithm for the minimization of 2-dimensional String Covers

Author: A Apostolico
A Bacciotti
A Katok
A Muchnik
A Tychonoff
A Wlodawer
AK Brodzik
AV Aho
DE Knuth
J Kopf
JR Searle
K Perlin
L Bursill
R Middlestead
RS Bird
S Havlin
WA Sethares
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/05/2019

String covers are a powerful tool for analyzing the quasi-periodicity of 1-dimensional data and find applications in automata theory, computational biology, coding and the analysis of transactional data. A \emph{cover} of a string $T$ is a string $C$ for which every letter of $T$ lies within some occurrence of $C$. String covers have been generalized in many ways, leading to \emph{k-covers}, \emph{$\lambda$-covers}, \emph{approximate covers} and were studied in different contexts such as \emph{indeterminate strings}. In this paper we generalize string covers to the context of 2-dimensional data, such as images. We show how they can be used for the extraction of textures from images and identification of primitive cells in lattice data. This has interesting applications in image compression, procedural terrain generation and crystallography.
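
The 1-dimensional cover definition in the abstract above can be illustrated with a minimal sketch. This is not the paper's output-sensitive 2-dimensional algorithm; it is a naive check, and the function name `is_cover` is a hypothetical one chosen for illustration.

```python
def is_cover(c: str, t: str) -> bool:
    """Return True if every letter of t lies within some occurrence of c.

    Naive illustration of the definition only (an assumption of this
    sketch, not the method of the listed paper).
    """
    if not c or len(c) > len(t):
        return False
    covered_up_to = 0  # index of the first letter of t not yet covered
    start = t.find(c)
    while start != -1:
        if start > covered_up_to:
            return False  # a gap exists between consecutive occurrences
        covered_up_to = start + len(c)
        start = t.find(c, start + 1)  # allow overlapping occurrences
    return covered_up_to == len(t)


if __name__ == "__main__":
    print(is_cover("aba", "abaababa"))
    print(is_cover("ab", "aba"))
```

Scanning occurrences from left to right is enough here, because occurrence start positions only increase, so the covered prefix can only grow.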

arXiv.org e-Print Archive

Crossref

Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark

Author: Karan Subhadeep
Zola Jaroslaw
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/10/2017

In Machine Learning, the parent set identification problem is to find a set of random variables that best explain a selected variable given the data and some predefined scoring function. This problem is a critical component of structure learning of Bayesian networks and Markov blanket discovery, and thus has many practical applications, ranging from fraud detection to clinical decision support. In this paper, we introduce a new distributed memory approach to the exact parent sets assignment problem. To achieve scalability, we derive theoretical bounds to constrain the search space when the MDL scoring function is used, and we reorganize the underlying dynamic programming such that the computational density is increased and fine-grain synchronization is eliminated. We then design an efficient realization of our approach in the Apache Spark platform. Through experimental results, we demonstrate that the method maintains strong scalability on a 500-core standalone Spark cluster, and it can be used to efficiently process data sets with 70 variables, far beyond the reach of the currently available solutions.

arXiv.org e-Print Archive

Crossref

An Efficient Algorithm for Enumerating Chordless Cycles and Chordless Paths

Author: A. Inokuchi
A.T. Balaban
D. Eppstein
E. Tomita
G.M. Downs
H. Satoh
K. Makino
M. Wild
R.C. Read
S. Kapoor
T. Asai
T. Hanser
T. Uno
Publication date: 01/01/2014

A chordless cycle (induced cycle) $C$ of a graph is a cycle without any chord, meaning that there is no edge outside the cycle connecting two vertices of the cycle. A chordless path is defined similarly. In this paper, we consider the problems of enumerating chordless cycles/paths of a given graph $G=(V,E)$, and propose algorithms taking $O(|E|)$ time for each chordless cycle/path. In the existing studies, the problems had not been deeply studied in the theoretical computer science area, and no output polynomial time algorithm has been proposed. Our experiments showed that the computation time of our algorithms is constant per chordless cycle/path for non-dense random graphs and real-world graphs. They also show that the number of chordless cycles is much smaller than the number of cycles. We applied the algorithm to prediction of NMR (Nuclear Magnetic Resonance) spectra, and increased the accuracy of the prediction.
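
The chordless-cycle definition in the abstract above can be checked directly, as in the following minimal sketch. It only tests a given vertex sequence and is not the paper's enumeration algorithm. The name `is_chordless_cycle` and the adjacency representation (a dict mapping each vertex to the set of its neighbours in an undirected graph) are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Set


def is_chordless_cycle(adj: Dict[Hashable, Set[Hashable]],
                       cycle: List[Hashable]) -> bool:
    """Check that `cycle` is a cycle of the graph with no chord.

    `adj` is assumed to be a symmetric adjacency map (hypothetical
    representation chosen for this sketch).
    """
    k = len(cycle)
    if k < 3 or len(set(cycle)) != k:
        return False  # too short, or a vertex is repeated
    members = set(cycle)
    for i, v in enumerate(cycle):
        nxt = cycle[(i + 1) % k]
        if nxt not in adj[v]:
            return False  # consecutive vertices must be joined by an edge
        # Within the cycle, v may only touch its two cycle neighbours;
        # any third neighbour inside `members` would be a chord.
        if len(adj[v] & members) != 2:
            return False
    return True


if __name__ == "__main__":
    square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    print(is_chordless_cycle(square, [0, 1, 2, 3]))
```

Adding the edge between vertices 0 and 2 to the square turns that edge into a chord, so the same vertex sequence is then rejected.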

arXiv.org e-Print Archive

Crossref

Large induced subgraphs via triangulations and CMSO

Author: Fomin Fedor
Todinca Ioan
Villanger Yngve
Publication date: 06/09/2013

We obtain an algorithmic meta-theorem for the following optimization problem. Let \phi\ be a Counting Monadic Second Order Logic (CMSO) formula and t be an integer. For a given graph G, the task is to maximize |X| subject to the following: there is a set of vertices F of G, containing X, such that the subgraph G[F] induced by F is of treewidth at most t, and the structure (G[F],X) models \phi. Some special cases of this optimization problem are the following generic examples. Each of these cases contains various problems as a special subcase: 1) "Maximum induced subgraph with at most l copies of cycles of length 0 modulo m", where for fixed nonnegative integers m and l, the task is to find a maximum induced subgraph of a given graph with at most l vertex-disjoint cycles of length 0 modulo m. 2) "Minimum \Gamma-deletion", where for a fixed finite set of graphs \Gamma\ containing a planar graph, the task is to find a maximum induced subgraph of a given graph containing no graph from \Gamma\ as a minor. 3) "Independent \Pi-packing", where for a fixed finite set of connected graphs \Pi, the task is to find an induced subgraph G[F] of a given graph G with the maximum number of connected components, such that each connected component of G[F] is isomorphic to some graph from \Pi. We give an algorithm solving the optimization problem on an n-vertex graph G in time O(#pmc n^{t+4} f(t,\phi)), where #pmc is the number of all potential maximal cliques in G and f is a function depending on t and \phi\ only. We also show how a similar running time can be obtained for the weighted version of the problem. Pipelined with known bounds on the number of potential maximal cliques, we deduce that our optimization problem can be solved in time O(1.7347^n) for arbitrary graphs, and in polynomial time for graph classes with a polynomial number of minimal separators.

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL Descartes

BioDiVinE: A Framework for Parallel Analysis of Biological Models

Author: C. Belta
David Šafránek
E.M. Clarke
Erik de Vink
G. Batt
H. de Jong
H. Ma
Hongwu Ma
I. Černá
Ion Petre
Ivana Černá
J. Barnat
Jan Láník
Jana Fabriková
Jiří Barnat
L. Calzone
Luboš Brim
M. Feinberg
M. Kloetzer
Ralph-Johan Back
S. Hoopes et al.
S. Khademi
S.K. Jha
Sven Dražan
X. Liu
Publication venue: 'Open Publishing Association'
Publication date: 01/10/2009

In this paper a novel tool, BioDiVinE, for parallel analysis of biological models is presented. The tool allows analysis of biological models specified in terms of a set of chemical reactions. Chemical reactions are transformed into a system of multi-affine differential equations. BioDiVinE employs techniques for finite discrete abstraction of the continuous state space. At that level, parallel analysis algorithms based on model checking are provided. In the paper, the key tool features are described and their application is demonstrated by means of a case study.
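
As a rough illustration of how a set of chemical reactions can give rise to multi-affine differential equations, the sketch below assumes mass-action kinetics with each species appearing at most once per reactant list. The abstract does not state which kinetics BioDiVinE uses, so this is an assumption, and the function `mass_action_rhs` and the reaction tuple layout are hypothetical names, not part of the tool.

```python
from typing import Dict, List, Tuple

# A reaction is (reactants, products, rate constant); hypothetical layout.
Reaction = Tuple[List[str], List[str], float]


def mass_action_rhs(reactions: List[Reaction],
                    state: Dict[str, float]) -> Dict[str, float]:
    """Evaluate dx/dt for every species under assumed mass-action kinetics.

    Each reaction rate is k times the product of reactant concentrations,
    so the right-hand side is affine in each variable separately
    (multi-affine) when no species repeats within one reactant list.
    """
    deriv = {species: 0.0 for species in state}
    for reactants, products, k in reactions:
        rate = k
        for species in reactants:
            rate *= state[species]
        for species in reactants:
            deriv[species] -= rate  # reactants are consumed
        for species in products:
            deriv[species] += rate  # products are created
    return deriv


if __name__ == "__main__":
    # A + B -> C with rate constant 2.0
    print(mass_action_rhs([(["A", "B"], ["C"], 2.0)],
                          {"A": 1.0, "B": 3.0, "C": 0.0}))
```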

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals