72 research outputs found
A Note on Easy and Efficient Computation of Full Abelian Periods of a Word
Constantinescu and Ilie (Bulletin of the EATCS 89, 167-170, 2006) introduced
the idea of an Abelian period with head and tail of a finite word. An Abelian
period is called full if both the head and the tail are empty. We present a
simple and easy-to-implement -time algorithm for computing all
the full Abelian periods of a word of length over a constant-size alphabet.
Experiments show that our algorithm significantly outperforms the
algorithm proposed by Kociumaka et al. (Proc. of STACS, 245-256, 2013) for the
same problem.Comment: Accepted for publication in Discrete Applied Mathematic
Sequence searching allowing for non-overlapping adjacent unbalanced translocations
Unbalanced translocations are among the most frequent chromosomal alterations, accounted for 30% of all losses of heterozygosity, a major genetic event causing inactivation of tumor suppressor genes. Despite of their central role in genomic sequence analysis, little attention has been devoted to the problem of matching sequences allowing for this kind of chromosomal alteration.
In this paper we investigate the approximate string matching problem when the edit operations are non-overlapping unbalanced translocations of adjacent factors. In particular, we first present a O(nm3)-time and O(m2)-space algorithm based on the dynamic-programming approach. Then we improve our first result by designing a second solution which makes use of the Directed Acyclic Word Graph of the pattern. In particular, we show that under the assumptions of equiprobability and independence of characters, our algorithm has a O(n log2σ m) average time complexity, for an alphabet of size σ, still maintaining the O(nm3)-time and the O(m2)-space complexity in the worst case. To the best of our knowledge this is the first solution in literature for the approximate string matching problem allowing for unbalanced translocations of factors
String Periods in the Order-Preserving Model
The order-preserving model (op-model, in short) was introduced quite recently but has already attracted significant attention because of its applications in data analysis. We introduce several types of periods in this setting (op-periods). Then we give algorithms to compute these periods in time O(n), O(n log log n), O(n log^2 log n/log log log n), O(n log n) depending on the type of periodicity. In the most general variant the number of different periods can be as big as Omega(n^2), and a compact representation is needed. Our algorithms require novel combinatorial insight into the properties of such periods
Using human computation in dead-zone based 2D pattern matching
Abstract. This paper examines the application of human computation (HC) to twodimensional image pattern matching. The two main goals of our algorithm are to use turks as the processing units to perform an efficient pattern match attempt on a subsection of an image, and to divide the work using a version of dead-zone based pattern matching. In this approach, human computation presents an alternative to machine learning by outsourcing computationally difficult work to humans, while the dead-zone search offers an efficient search paradigm open to parallelization-making the combination a powerful approach for searching for patterns in two-dimensional images
Optimal-Hash Exact String Matching Algorithms
String matching is the problem of finding all the occurrences of a pattern in
a text. We propose improved versions of the fast family of string matching
algorithms based on hashing -grams. The improvement consists of considering
minimal values such that each -grams of the pattern has a unique hash
value. The new algorithms are fastest than algorithm of the HASH family for
short patterns on large size alphabets.Comment: 14 page
- …