EERTREE: An Efficient Data Structure for Processing Palindromes in Strings

Authors: Rubinchik Mikhail; Shur Arseny M.
Publication date: 17/08/2015

We propose a new linear-size data structure which provides fast access to all palindromic substrings of a string or a set of strings. This structure inherits some ideas from the construction of both the suffix trie and the suffix tree. Using this structure, we present simple and efficient solutions for a number of problems involving palindromes. Comment: 21 pages, 2 figures. Accepted to IWOCA 201

arXiv.org e-Print Archive

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin
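The EERTREE abstract above describes a linear-size structure giving access to all palindromic substrings. As a rough illustration only, the following is a minimal sketch of a commonly used palindromic-tree construction: one node per distinct palindrome, with suffix links to the longest proper palindromic suffix. The class name `Eertree`, its fields, and the helper `count_distinct_palindromes` are assumptions made for this example and are not taken from the paper.

```python
class Eertree:
    """Minimal palindromic tree sketch (names are illustrative assumptions)."""

    def __init__(self):
        # Node 0: imaginary root of length -1; node 1: empty palindrome.
        self.length = [-1, 0]
        self.link = [0, 0]        # suffix link of each node
        self.edges = [{}, {}]     # edges[v][c] = node for the palindrome c + v + c
        self.text = []
        self.last = 1             # node of the longest palindromic suffix so far

    def _find(self, node, pos):
        # Follow suffix links until the palindrome at `node` can be
        # extended on both sides by the character text[pos].
        while True:
            start = pos - self.length[node] - 1
            if start >= 0 and self.text[start] == self.text[pos]:
                return node
            node = self.link[node]

    def add(self, ch):
        """Append one character; return True if a new palindrome appeared."""
        self.text.append(ch)
        pos = len(self.text) - 1
        parent = self._find(self.last, pos)
        if ch in self.edges[parent]:
            self.last = self.edges[parent][ch]
            return False
        new = len(self.length)
        self.length.append(self.length[parent] + 2)
        self.edges.append({})
        if self.length[new] == 1:
            # A single character links to the empty palindrome.
            self.link.append(1)
        else:
            shorter = self._find(self.link[parent], pos)
            self.link.append(self.edges[shorter][ch])
        self.edges[parent][ch] = new
        self.last = new
        return True


def count_distinct_palindromes(s):
    """Number of distinct non-empty palindromic substrings of s."""
    tree = Eertree()
    for ch in s:
        tree.add(ch)
    return len(tree.length) - 2   # exclude the two root nodes
```

For example, `count_distinct_palindromes("eertree")` counts the palindromes e, ee, r, t, rtr, ertre and eertree. Each appended character creates at most one new node, which is why the structure stays linear in the length of the string.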

Optimal Parallel Construction of Minimal Suffix and Factor Automata

Authors: Breslauer Dany; Hariharan Ramesh
Publication venue: Aarhus University Library
Publication date: 01/01/1995

This paper gives optimal parallel algorithms for the construction of the smallest deterministic finite automata recognizing all the suffixes and the factors of a string. The algorithms use recently discovered optimal parallel suffix tree construction algorithms together with data structures for the efficient manipulation of trees, exploiting the well-known relation between suffix and factor automata and suffix trees.

CiteSeerX

Tidsskrift.dk (Det Kongelige Bibliotek)

MPG.PuRe

Suffix Structures and Circular Pattern Problems

Author: Lin Jie
Publication venue: The Research Repository @ WVU
Publication date: 01/08/2011

The suffix tree is a data structure used to represent all the suffixes in a string. However, a major problem with the suffix tree is its practical space requirement. In this dissertation, we propose an efficient data structure -- the virtual suffix tree (VST) -- which requires less space than other recently proposed data structures for suffix trees and suffix arrays. On average, the space requirement (including that for suffix arrays and suffix links) is 13.8n bytes for the regular VST, and 12.05n bytes in its compact form, where n is the length of the sequence.

Markov models are very popular for modeling complex sequences. In this dissertation, we present the probabilistic suffix array (PSA), a space-efficient alternative to the probabilistic suffix tree (PST) used to represent Markov models. The PSA provides all the capabilities of the PST, such as learning and prediction, and maintains the same linear time construction (linearity with respect to sequence length). The PSA, however, has a significantly smaller memory requirement than the PST, for both the construction stage, and at the time of usage.

Using the proposed suffix data structures, we study the circular pattern matching (CPM) problem. We provide a linear time, linear space algorithm to solve the exact circular pattern matching problem. We then present four algorithms to address the approximate circular pattern matching (ACPM) problem. Our bidirectional ACPM algorithm provides the best time complexity when compared with other algorithms proposed in the literature. Further, we define the circular pattern discovery (CPD) problem and present algorithms to solve this problem. Using the proposed circular pattern matching algorithms, we perform experiments on computational analysis and function prediction for multidomain proteins.

The Research Repository @ WVU (West Virginia University)
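To make the exact circular pattern matching problem from the abstract above concrete, here is a naive sketch. It is not the dissertation's linear-time, suffix-structure-based algorithm. It relies only on the standard observation that a window of length m is a rotation of the pattern exactly when it occurs in the pattern concatenated with itself. The function name `circular_matches` is an assumption made for this example.

```python
def circular_matches(text, pattern):
    """Return start positions in `text` where some rotation of `pattern` occurs.

    Naive illustration: each window is tested against pattern + pattern,
    so the worst-case cost is far from linear.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    doubled = pattern + pattern  # contains every rotation of the pattern
    return [
        i
        for i in range(len(text) - m + 1)
        if text[i:i + m] in doubled
    ]
```

Calling `circular_matches("abcab", "cab")` reports positions 0, 1 and 2, since the windows abc, bca and cab are all rotations of the pattern. A linear-time solution would replace the per-window substring test with an index over the doubled pattern.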

The Suffix Tree of a Tree and Minimizing Sequential Transducers

Author: Breslauer Dany
Publication venue: Aarhus University Library
Publication date: 17/06/1995

This paper gives a linear-time algorithm for the construction of the suffix tree of a tree. The suffix tree of a tree is used to obtain an efficient algorithm for the minimization of sequential transducers.

Tidsskrift.dk (Det Kongelige Bibliotek)