859 research outputs found

A Minimal Periods Algorithm with Applications

Author: A. Apostolico
A.O. Slisenko
A.S. Fraenkel
B. Schieber
D. Beauquier
D. Gusfield
D. Gusfield
D. Harel
D. Knuth
E.M. McCreight
J. Duval
J. Stoye
L. Ilie
M. Crochemore
M. Crochemore
M. Crochemore
M. Main
M. Main
M.G. Main
R. Kolpakov
S.R. Kosaraju
Publication venue: Springer Science and Business Media LLC
Publication date: 17/11/2009

Kosaraju, in ``Computation of squares in a string'', briefly described a linear-time algorithm for computing the minimal squares starting at each position in a word. Using the same construction of suffix trees, we generalize his result and describe in detail how to compute in O(k|w|) time the minimal k-th power, with period of length larger than s, starting at each position in a word w for arbitrary exponent $k\geq2$ and integer $s\geq0$. We provide the complete proof of correctness of the algorithm, which is somewhat unclear in Kosaraju's original paper. The algorithm can be used as a subroutine to detect certain types of pseudo-patterns in words, which was our original motivation for studying the generalization. Comment: 14 pages
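To make the quantity in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm the abstract refers to and does not achieve O(k|w|) time; the function name `minimal_powers` and its return convention (`None` when no power exists) are assumptions made for illustration only.

```python
def minimal_powers(w, k=2, s=0):
    """For each start position i, return the smallest period p > s such that
    w[i:i + k*p] is a k-th power (a block of length p repeated k times),
    or None if no such power starts at i.

    Brute force for illustration; NOT the linear-time method of the paper.
    """
    n = len(w)
    result = []
    for i in range(n):
        found = None
        # A k-th power with period p needs k*p characters starting at i.
        for p in range(s + 1, (n - i) // k + 1):
            if w[i:i + k * p] == w[i:i + p] * k:
                found = p
                break
        result.append(found)
    return result
```

For example, `minimal_powers("aabab")` gives `[1, 2, None, None, None]`: the square `aa` starts at position 0 and the square `abab` at position 1. With `s=1` the square `aa` is excluded because its period is not larger than 1.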

arXiv.org e-Print Archive

CiteSeerX

Crossref

The stochastic matching problem

Author: A. Braunstein
A. Prekopa
A. Ramezanpour
C. Papadimitriou
D. Gusfield
D. Shah
D. P. Bertsekas
E. Trucco
F. Altarelli
J. Birge
L. Lovasz
R. Zecchina
R. J. Baxter
Publication venue: American Physical Society (APS)
Publication date: 01/01/2011

The matching problem plays a basic role in combinatorial optimization and in statistical mechanics. In its stochastic variants, optimization decisions have to be taken given only some probabilistic information about the instance. While the deterministic case can be solved in polynomial time, stochastic variants are worst-case intractable. We propose an efficient method to solve stochastic matching problems which combines some features of the survey propagation equations and of the cavity method. We test it on random bipartite graphs, for which we analyze the phase diagram and compare the results with exact bounds. Our approach is shown numerically to be effective on the full range of parameters, and to outperform state-of-the-art methods. Finally we discuss how the method can be generalized to other problems of optimization under uncertainty. Comment: Published version has very minor changes

arXiv.org e-Print Archive

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Wave Energy: a Pacific Perspective

Author: D. Gusfield
D. Okanohara
G. Manzini
J. Fischer
J. Kärkkäinen
J. Kärkkäinen
K. Sadakane
M.I. Abouelhoda
P. Ferragina
R. Dementiev
R. Sinha
S.J. Puglisi
S.J. Puglisi
T. Kasai
U. Manber
V. Mäkinen
Publication venue: The Royal Society
Publication date: 01/01/2009

This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by The Royal Society and can be found at: http://rsta.royalsocietypublishing.org/. This paper illustrates the status of wave energy development in Pacific Rim countries by characterizing the available resource and introducing the region's current and potential future leaders in wave energy converter development. It also describes the existing licensing and permitting process as well as potential environmental concerns. Capabilities of Pacific Ocean testing facilities are described in addition to the region's vision of the future of wave energy.

CiteSeerX

Crossref

ScholarsArchive@OSU

RMIT Research Repository

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

On the maximal sum of exponents of runs in a string

Author: D. Gusfield
F. Franek
J. Berstel
J. Simpson
M. Crochemore
M. Crochemore
M. Crochemore
M. Crochemore
M. Crochemore
M. Giraud
M. Lothaire
R.M. Kolpakov
S.J. Puglisi
W. Rytter
W. Rytter
Publication venue: Springer Science and Business Media LLC
Publication date: 25/03/2010

A run is an inclusion-maximal occurrence in a string (as a subinterval) of a repetition $v$ with a period $p$ such that $2p \le |v|$. The exponent of a run is defined as $|v|/p$ and is $\ge 2$. We show new bounds on the maximal sum of exponents of runs in a string of length $n$. Our upper bound of $4.1n$ is better than the best previously known proven bound of $5.6n$ by Crochemore & Ilie (2008). The lower bound of $2.035n$, obtained using a family of binary words, contradicts the conjecture of Kolpakov & Kucherov (1999) that the maximal sum of exponents of runs in a string of length $n$ is smaller than $2n$. Comment: 7 pages, 1 figure
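The definitions of a run and its exponent can be illustrated with a small brute-force sketch in Python. This is only an illustration of the definitions, not the technique used to obtain the bounds in the abstract; the names `smallest_period`, `runs`, and `sum_of_exponents`, and the half-open `(start, end, period)` representation, are assumptions introduced here.

```python
from fractions import Fraction


def smallest_period(v):
    """Smallest p >= 1 with v[t] == v[t + p] for every valid t."""
    n = len(v)
    for p in range(1, n + 1):
        if all(v[t] == v[t + p] for t in range(n - p)):
            return p
    return n


def runs(w):
    """All runs of w as sorted (start, end, period) triples, end exclusive.

    Brute force: for each candidate period p, scan for maximal stretches
    where w[t] == w[t + p], keep those of length >= 2p, and keep a stretch
    only if p is its smallest period (so each run is reported once).
    """
    n = len(w)
    found = set()
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            j = i
            while j + p < n and w[j] == w[j + p]:
                j += 1
            start, end = i, j + p  # w[start:end] has period p and is maximal
            if end - start >= 2 * p and smallest_period(w[start:end]) == p:
                found.add((start, end, p))
            i = j + 1
    return sorted(found)


def sum_of_exponents(w):
    """Exact sum of |v| / p over all runs of w."""
    return sum(Fraction(end - start, p) for start, end, p in runs(w))
```

For the word `aabaabaa`, this reports the runs `(0, 2, 1)`, `(0, 8, 3)`, `(3, 5, 1)`, and `(6, 8, 1)`, with exponents 2, 8/3, 2, and 2, so the sum of exponents is 26/3.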

arXiv.org e-Print Archive

CiteSeerX

Crossref

Elsevier - Publisher Connector

King's Research Portal

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Scheduling Jobs in Flowshops with the Introduction of Additional Machines in the Future

Author: A. Apostolico
B. Smyth
C.J. Colbourn
D. Gusfield
E. Ukkonen
E.M. McCreight
G. Manzini
G. Manzini
J. Fischer
J. Fischer
M.A. Bender
M.I. Abouelhoda
S. Burkhardt
S.J. Puglisi
T. Kasai
U. Manber
Publication venue: Elsevier
Publication date: 01/01/2008

This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by Elsevier and can be found at: http://www.journals.elsevier.com/expert-systems-with-applications/. The problem of scheduling jobs to minimize total weighted tardiness in flowshops, with the possibility of evolving into hybrid flowshops in the future, is investigated in this paper. As this research is guided by a real problem in industry, the flowshop considered has considerable flexibility, which stimulated the development of an innovative methodology for this research. Each stage of the flowshop currently has one or several identical machines. However, the manufacturing company is planning to introduce additional machines with different capabilities in different stages in the near future. Thus, the algorithm proposed and developed for the problem is not only capable of solving the current flow line configuration but also the potential new configurations that may result in the future. A meta-heuristic search algorithm based on Tabu search is developed to solve this NP-hard, industry-guided problem. Six different initial solution finding mechanisms are proposed. A carefully planned nested split-plot design is performed to test the significance of different factors and their impact on the performance of the different algorithms. To the best of our knowledge, this research is the first of its kind that attempts to solve an industry-guided problem with the concern for future developments.

CiteSeerX

Crossref

ScholarsArchive@OSU

RMIT Research Repository

Duel and sweep algorithm for order-preserving pattern matching

Author: A Amir
D Gusfield
DE Knuth
J Kim
M Crochemore
M Kubica
MM Hasan
R Cole
RN Horspool
RS Boyer
S Cho
S Faro
T Chhabra
U Vishkin
U Vishkin
Publication date: 26/05/2017

Given a text $T$ and a pattern $P$ over alphabet $\Sigma$, the classic exact matching problem searches for all occurrences of pattern $P$ in text $T$. Unlike the exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real values. In this paper, we propose an efficient algorithm for the OPPM problem using the "duel-and-sweep" paradigm. Our algorithm runs in $O(n + m\log m)$ time in general and $O(n + m)$ time under the assumption that the characters in a string can be sorted in linear time with respect to the string size. We also perform experiments and show that our algorithm is faster than the KMP-based algorithm. Last, we introduce two-dimensional order-preserving pattern matching and give a duel-and-sweep algorithm that runs in $O(n^2)$ time for the duel stage and $O(n^2 m)$ time for the sweep stage, with $O(m^3)$ preprocessing time. Comment: 13 pages, 5 figures
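What "matching by relative order" means can be shown with a naive Python sketch. It is not the duel-and-sweep algorithm from the abstract and runs in roughly O(nm log m) time rather than O(n + m log m); the helper names `rank_signature` and `op_match` are assumptions made for this illustration.

```python
def rank_signature(seq):
    """Replace each element by its dense rank among the distinct values.

    Two sequences of equal length have the same relative order (ties
    included) exactly when their rank signatures are equal.
    """
    order = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [order[value] for value in seq]


def op_match(text, pattern):
    """Return every start index i where text[i:i + len(pattern)] has the
    same relative order as pattern. Naive window-by-window comparison.
    """
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]
```

For instance, `op_match([10, 20, 15, 28, 32, 12, 50, 40], [1, 3, 2])` returns `[0, 5]`, because the windows `[10, 20, 15]` and `[12, 50, 40]` both follow the low, high, middle shape of the pattern even though no values coincide.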

arXiv.org e-Print Archive

Crossref

Bethe Ansatz in the Bernoulli Matching Model of Random Sequence Alignment

Author: A. M. Vershik
D. Gusfield
D. Sankoff
J. M. Hammersley
Kirone Mallick
M. Ablowitz
M. S. Waterman
R. Dubrin
R. J. Baxter
R. Wagner
S. F. Altschul
S. M. Ulam
Satya N. Majumdar
Sergei Nechaev
V. Dancik
Publication venue: American Physical Society (APS)
Publication date: 01/01/2008

For the Bernoulli Matching model of the sequence alignment problem, we apply the Bethe ansatz technique via an exact mapping to the 5-vertex model on a square lattice. Considering the terrace-like representation of the sequence alignment problem, we reproduce by the Bethe ansatz the results for the averaged length of the Longest Common Subsequence in the Bernoulli approximation. In addition, we compute the average number of nucleation centers of the terraces. Comment: 14 pages, 5 figures (some points are clarified)

arXiv.org e-Print Archive

Crossref

HAL-CEA

Two algorithms for the student-project allocation problem

Author: Anwar
Brassard
David F. Manlove
David J. Abraham
Eguchi
Fleiner
Fleiner
Gale
Gale
Gusfield
Irving
Irving
Manlove
Manlove
Ng
Proll
Robert W. Irving
Romero-Medina
Roth
Roth
Roth
Teo
Teo
Publication venue: Elsevier BV
Publication date: 01/01/2007

We study the Student-Project Allocation problem (SPA), a generalisation of the classical Hospitals / Residents problem (HR). An instance of SPA involves a set of students, projects and lecturers. Each project is offered by a unique lecturer, and both projects and lecturers have capacity constraints. Students have preferences over projects, whilst lecturers have preferences over students. We present two optimal linear-time algorithms for allocating students to projects, subject to the preference and capacity constraints. In particular, each algorithm finds a stable matching of students to projects. Here, the concept of stability generalises the stability definition in the HR context. The stable matching produced by the first algorithm is simultaneously best-possible for all students, whilst the one produced by the second algorithm is simultaneously best-possible for all lecturers. We also prove some structural results concerning the set of stable matchings in a given instance of SPA. The SPA problem model that we consider is very general and has applications to a range of different contexts besides student-project allocation

CiteSeerX

Elsevier - Publisher Connector

Crossref

Enlighten

Time-frequency scaling transformation of the phonocardiogram based of the matching pursuit method.

Author: A.G. Clark
A.J. Jeffreys
B. Padhukasahasram
C. Carlson
D. Gusfield
D. Gusfield
D. Gusfield
G. Drouin
J. Hein
J. Hein
J. Hein
J.C. Stephens
J.D. Wall
L. Frisse
M. Lajoie
N. El-Mabrouk
P. Fearnhead
R. Hudson
R. Hudson
S. Sawyer
S.R. Myers
T. Wiehe
The International HapMap Consortium
V. Bafna
V. Bafna
Y.S. Song
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/08/1998

A time-frequency scaling transformation based on the matching pursuit (MP) method is developed for the phonocardiogram (PCG). The MP method decomposes a signal into a series of time-frequency atoms by using an iterative process. The modification of the time scale of the PCG can be performed without perceptible change in its spectral characteristics. It is also possible to modify the frequency scale without changing the temporal properties. The technique has been tested on 11 PCGs containing heart sounds and different murmurs. A scaling/inverse-scaling procedure was used for quantitative evaluation of the scaling performance. Both the spectrogram and an MP-based Wigner distribution were used for visual comparison in the time-frequency domain. The results showed that the technique is suitable and effective for the time-frequency scale transformation of both the transient property of the heart sounds and the more complex random property of the murmurs. It is also shown that the effectiveness of the method is strongly related to the optimization of the parameters used for the decomposition of the signals.

Crossref

HAL-Inserm

HAL-Rennes 1

Fast Label Extraction in the CDAWG

Author: A Blumer
D Belazzougui
D Gusfield
J Sirén
L Gasieniec
LS Russo
M Crochemore
M Crochemore
M Crochemore
M Crochemore
M Raffinot
MA Bender
O Berkman
T Gagie
V Mäkinen
V Mäkinen
Publication date: 26/09/2017

The compact directed acyclic word graph (CDAWG) of a string $T$ of length $n$ takes space proportional just to the number $e$ of right extensions of the maximal repeats of $T$, and it is thus an appealing index for highly repetitive datasets, like collections of genomes from similar species, in which $e$ grows significantly more slowly than $n$. We reduce from $O(m\log{\log{n}})$ to $O(m)$ the time needed to count the number of occurrences of a pattern of length $m$, using an existing data structure that takes an amount of space proportional to the size of the CDAWG. This implies a reduction from $O(m\log{\log{n}}+\mathtt{occ})$ to $O(m+\mathtt{occ})$ in the time needed to locate all the $\mathtt{occ}$ occurrences of the pattern. We also reduce from $O(k\log{\log{n}})$ to $O(k)$ the time needed to read the $k$ characters of the label of an edge of the suffix tree of $T$, and we reduce from $O(m\log{\log{n}})$ to $O(m)$ the time needed to compute the matching statistics between a query of length $m$ and $T$, using an existing representation of the suffix tree based on the CDAWG. All such improvements derive from extracting the label of a vertex or of an arc of the CDAWG using a straight-line program induced by the reversed CDAWG. Comment: 16 pages, 1 figure. In proceedings of the 24th International Symposium on String Processing and Information Retrieval (SPIRE 2017). arXiv admin note: text overlap with arXiv:1705.0864

arXiv.org e-Print Archive

Crossref