765 research outputs found
A Minimal Periods Algorithm with Applications
Kosaraju in ``Computation of squares in a string'' briefly described a
linear-time algorithm for computing the minimal squares starting at each
position in a word. Using the same construction of suffix trees, we generalize
his result and describe in detail how to compute in O(k|w|)-time the minimal
k-th power, with period of length larger than s, starting at each position in a
word w for arbitrary exponent k and integer s. We provide the
complete proof of correctness of the algorithm, which is somewhat unclear
in Kosaraju's original paper. The algorithm can be used as a subroutine
to detect certain types of pseudo-patterns in words, which was our original
motivation for studying the generalization. Comment: 14 pages
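To make the quantity in this abstract concrete, here is a minimal brute-force sketch. It is not the suffix-tree algorithm the paper describes, and it runs in far worse than O(k|w|) time. The function name `minimal_powers` and the convention of returning 0 when no power exists are assumptions made for illustration.

```python
def minimal_powers(w: str, k: int = 2, s: int = 0) -> list[int]:
    """For each start position i, return the smallest period p > s such that
    w[i:i + k*p] is a k-th power (a block of length p repeated k times).
    Returns 0 at positions where no such power starts (assumed convention)."""
    n = len(w)
    result = []
    for i in range(n):
        found = 0
        # The longest candidate period must still fit k times in the suffix.
        for p in range(s + 1, (n - i) // k + 1):
            block = w[i:i + p]
            if w[i:i + k * p] == block * k:
                found = p
                break  # periods are tried in increasing order, so this is minimal
        result.append(found)
    return result


if __name__ == "__main__":
    print(minimal_powers("aabab"))        # minimal squares per position
    print(minimal_powers("aabab", s=1))   # only periods longer than 1
```

A reference implementation like this is mainly useful for cross-checking a linear-time implementation on small inputs.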
The stochastic matching problem
The matching problem plays a basic role in combinatorial optimization and in
statistical mechanics. In its stochastic variants, optimization decisions have
to be taken given only some probabilistic information about the instance. While
the deterministic case can be solved in polynomial time, stochastic variants
are worst-case intractable. We propose an efficient method to solve stochastic
matching problems which combines some features of the survey propagation
equations and of the cavity method. We test it on random bipartite graphs, for
which we analyze the phase diagram and compare the results with exact bounds.
Our approach is shown numerically to be effective on the full range of
parameters, and to outperform state-of-the-art methods. Finally we discuss how
the method can be generalized to other problems of optimization under
uncertainty. Comment: Published version has very minor changes
Wave Energy: a Pacific Perspective
This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by The Royal Society and can be found at: http://rsta.royalsocietypublishing.org/. This paper illustrates the status of wave energy development in Pacific Rim countries by characterizing the available resource and introducing the region's current and potential future leaders in wave energy converter development. It also describes the existing licensing and permitting process as well as potential environmental concerns. Capabilities of Pacific Ocean testing facilities are described in addition to the region's vision of the future of wave energy.
Searching of gapped repeats and subrepetitions in a word
A gapped repeat is a factor of the form where and are nonempty
words. The period of the gapped repeat is defined as . The gapped
repeat is maximal if it cannot be extended to the left or to the right by at
least one letter while preserving its period. The gapped repeat is called
-gapped if its period is not greater than . A
-subrepetition is a factor whose exponent is less than 2 but is not
less than (the exponent of the factor is the quotient of the length
and the minimal period of the factor). The -subrepetition is maximal if
it cannot be extended to the left or to the right by at least one letter while
preserving its minimal period. We reveal a close relation between maximal
gapped repeats and maximal subrepetitions. Moreover, we show that in a word of
length the number of maximal -gapped repeats is bounded by
and the number of maximal -subrepetitions is bounded by
. Using the obtained upper bounds, we propose algorithms for
finding all maximal -gapped repeats and all maximal
-subrepetitions in a word of length . The algorithm for finding all
maximal -gapped repeats has time complexity for the case
of constant alphabet size and time complexity for the
general case. For finding all maximal -subrepetitions we propose two
algorithms. The first algorithm has time complexity for the case of
constant alphabet size and time complexity for the general case. The
second algorithm has expected time complexity
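The abstract defines the exponent of a factor as the quotient of its length and its minimal period. The following sketch computes that quantity by brute force; the helper names `minimal_period` and `exponent` are hypothetical, and this is not one of the algorithms proposed in the paper.

```python
from fractions import Fraction


def minimal_period(w: str) -> int:
    """Smallest p >= 1 such that w[i] == w[i + p] for every valid i."""
    n = len(w)
    for p in range(1, n + 1):
        if all(w[i] == w[i + p] for i in range(n - p)):
            return p
    return 0  # only reached for the empty word


def exponent(w: str) -> Fraction:
    """Length divided by minimal period, as defined in the abstract."""
    if not w:
        raise ValueError("exponent is undefined for the empty word")
    return Fraction(len(w), minimal_period(w))


if __name__ == "__main__":
    print(exponent("abaab"))  # a factor with exponent below 2
    print(exponent("abab"))   # a square has exponent exactly 2
```

Under this definition, a factor with exponent at least 2 is a repetition, while the subrepetitions of the abstract are the factors whose exponent falls just below 2.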
Bethe Ansatz in the Bernoulli Matching Model of Random Sequence Alignment
For the Bernoulli Matching model of sequence alignment problem we apply the
Bethe ansatz technique via an exact mapping to the 5--vertex model on a square
lattice. Considering the terrace--like representation of the sequence alignment
problem, we reproduce by the Bethe ansatz the results for the averaged length
of the Longest Common Subsequence in Bernoulli approximation. In addition, we
compute the average number of nucleation centers of the terraces. Comment: 14 pages, 5 figures (some points are clarified)
Duel and sweep algorithm for order-preserving pattern matching
Given a text and a pattern over alphabet , the classic exact
matching problem searches for all occurrences of pattern in text .
Unlike the exact matching problem, order-preserving pattern matching (OPPM)
considers the relative order of elements, rather than their real values. In
this paper, we propose an efficient algorithm for the OPPM problem using the
"duel-and-sweep" paradigm. Our algorithm runs in time in
general and time under an assumption that the characters in a string
can be sorted in linear time with respect to the string size. We also perform
experiments and show that our algorithm is faster than the KMP-based algorithm.
Finally, we introduce two-dimensional order-preserving pattern matching and
give a duel-and-sweep algorithm that runs in time for the duel stage and
time for the sweeping stage with preprocessing time. Comment: 13 pages, 5 figures
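As an illustration of what an order-preserving match means, the sketch below checks every window of the text against the pattern by comparing the relative order of all pairs of positions. This is a naive quadratic-per-window baseline, not the duel-and-sweep algorithm of the paper. The function names are hypothetical.

```python
from typing import Sequence


def order_isomorphic(x: Sequence, y: Sequence) -> bool:
    """True if x and y have the same length and the same relative order,
    i.e. x[i] < x[j] exactly when y[i] < y[j] for all positions i, j."""
    if len(x) != len(y):
        return False
    m = len(x)
    return all(
        (x[i] < x[j]) == (y[i] < y[j])
        for i in range(m)
        for j in range(m)
    )


def oppm_naive(text: Sequence, pattern: Sequence) -> list[int]:
    """Start indices of all windows of text that are order-isomorphic
    to pattern."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if order_isomorphic(text[i:i + m], pattern)
    ]


if __name__ == "__main__":
    # The pattern has the shape low, high, middle.
    print(oppm_naive([10, 20, 15, 30, 25, 40], [1, 3, 2]))
```

Checking all ordered pairs also handles ties: two equal elements in one sequence force equal elements at the same positions in the other.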
Time-frequency scaling transformation of the phonocardiogram based on the matching pursuit method.
A time-frequency scaling transformation based on the matching pursuit (MP) method is developed for the phonocardiogram (PCG). The MP method decomposes a signal into a series of time-frequency atoms by using an iterative process. The modification of the time scale of the PCG can be performed without perceptible change in its spectral characteristics. It is also possible to modify the frequency scale without changing the temporal properties. The technique has been tested on 11 PCGs containing heart sounds and different murmurs. A scaling/inverse-scaling procedure was used for quantitative evaluation of the scaling performance. Both the spectrogram and an MP-based Wigner distribution were used for visual comparison in the time-frequency domain. The results showed that the technique is suitable and effective for the time-frequency scale transformation of both the transient property of the heart sounds and the more complex random property of the murmurs. It is also shown that the effectiveness of the method is strongly related to the optimization of the parameters used for the decomposition of the signals.
Suffix Tree of Alignment: An Efficient Index for Similar Data
We consider an index data structure for similar strings. The generalized
suffix tree can be a solution for this. The generalized suffix tree of two
strings and is a compacted trie representing all suffixes in and
. It has leaves and can be constructed in time.
However, if the two strings are similar, the generalized suffix tree is not
efficient because it does not exploit the similarity which is usually
represented as an alignment of and .
In this paper we propose a space/time-efficient suffix tree of alignment
which wisely exploits the similarity in an alignment. Our suffix tree for an
alignment of and has leaves where is the sum of
the lengths of all parts of different from and is the sum of the
lengths of some common parts of and . We did not compromise the pattern
search to reduce the space. Our suffix tree can be searched for a pattern
in time where is the number of occurrences of in and
. We also present an efficient algorithm to construct the suffix tree of
alignment. When the suffix tree is constructed from scratch, the algorithm
requires time where is the sum of the lengths
of other common substrings of and . When the suffix tree of is
already given, it requires time. Comment: 12 pages
Optimality Clue for Graph Coloring Problem
In this paper, we present a new approach that decides whether a solution
found by a heuristic qualifies as a potentially optimal solution. Our approach is based on
the following observation: for a minimization problem, the number of admissible
solutions decreases with the value of the objective function. For the Graph
Coloring Problem (GCP), we confirm this observation and present a new way to
prove optimality. This proof is based on the counting of the number of
different k-colorings and the number of independent sets of a given graph G.
Exact solution-counting problems are hard (#P-complete).
However, we show that, using only randomized heuristics, it is possible to
define an estimation of the upper bound of the number of k-colorings. This
estimate has been calibrated on a large benchmark of graph instances for which
the exact number of optimal k-colorings is known. Our approach, called
optimality clue, builds a sample of k-colorings of a given graph by running
one randomized heuristic many times on the same graph instance. We use the
evolutionary algorithm HEAD [Moalic and Gondran, 2018], which is one of the most
efficient heuristics for GCP. Optimality clue agrees with the standard
definition of optimality on a large number of instances of the DIMACS and RBCII
benchmarks where the optimum is known. Then, we show the clue of optimality
for another set of graph instances. Keywords: Optimality, Metaheuristics, Near-optimal
k-Abelian Pattern Matching
Two words are called k-abelian equivalent if they share the same multiplicities for all factors of length at most k. We present an optimal linear time algorithm for identifying all occurrences of factors in a text that are k-abelian equivalent to some pattern. Moreover, an optimal algorithm for finding the largest k for which two words are k-abelian equivalent is given. Solutions for various online versions of the k-abelian pattern matching problem are also proposed.
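The definition of k-abelian equivalence in this abstract can be checked directly by counting factors, as in the minimal sketch below. It follows the definition literally and is not the optimal linear-time algorithm the paper presents. The names `factor_counts` and `k_abelian_equivalent` are hypothetical.

```python
from collections import Counter


def factor_counts(w: str, k: int) -> Counter:
    """Multiplicity of every factor (contiguous substring) of w
    whose length is between 1 and k."""
    return Counter(
        w[i:i + m]
        for m in range(1, k + 1)
        for i in range(len(w) - m + 1)
    )


def k_abelian_equivalent(u: str, v: str, k: int) -> bool:
    """True if u and v contain every factor of length at most k
    the same number of times."""
    return factor_counts(u, k) == factor_counts(v, k)


if __name__ == "__main__":
    # Same letter counts, but "ab" occurs twice in one word and once in the other.
    print(k_abelian_equivalent("abab", "baba", 1))
    print(k_abelian_equivalent("abab", "baba", 2))
```

For k = 1 this reduces to ordinary abelian equivalence, where the two words are anagrams of each other.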