859 research outputs found

    A Minimal Periods Algorithm with Applications

    Full text link
    Kosaraju in ``Computation of squares in a string'' briefly described a linear-time algorithm for computing the minimal squares starting at each position in a word. Using the same construction of suffix trees, we generalize his result and describe in detail how to compute in O(k|w|)-time the minimal k-th power, with period of length larger than s, starting at each position in a word w for arbitrary exponent k2k\geq2 and integer s0s\geq0. We provide the complete proof of correctness of the algorithm, which is somehow not completely clear in Kosaraju's original paper. The algorithm can be used as a sub-routine to detect certain types of pseudo-patterns in words, which is our original intention to study the generalization.Comment: 14 page

    The stochastic matching problem

    Get PDF
    The matching problem plays a basic role in combinatorial optimization and in statistical mechanics. In its stochastic variants, optimization decisions have to be taken given only some probabilistic information about the instance. While the deterministic case can be solved in polynomial time, stochastic variants are worst-case intractable. We propose an efficient method to solve stochastic matching problems which combines some features of the survey propagation equations and of the cavity method. We test it on random bipartite graphs, for which we analyze the phase diagram and compare the results with exact bounds. Our approach is shown numerically to be effective on the full range of parameters, and to outperform state-of-the-art methods. Finally we discuss how the method can be generalized to other problems of optimization under uncertainty.Comment: Published version has very minor change

    Wave Energy: a Pacific Perspective

    Get PDF
    This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by The Royal Society and can be found at: http://rsta.royalsocietypublishing.org/.This paper illustrates the status of wave energy development in Pacific Rim countries by characterizing the available resource and introducing the region‟s current and potential future leaders in wave energy converter development. It also describes the existing licensing and permitting process as well as potential environmental concerns. Capabilities of Pacific Ocean testing facilities are described in addition to the region‟s vision of the future of wave energy

    On the maximal sum of exponents of runs in a string

    Get PDF
    A run is an inclusion maximal occurrence in a string (as a subinterval) of a repetition vv with a period pp such that 2pv2p \le |v|. The exponent of a run is defined as v/p|v|/p and is 2\ge 2. We show new bounds on the maximal sum of exponents of runs in a string of length nn. Our upper bound of 4.1n4.1n is better than the best previously known proven bound of 5.6n5.6n by Crochemore & Ilie (2008). The lower bound of 2.035n2.035n, obtained using a family of binary words, contradicts the conjecture of Kolpakov & Kucherov (1999) that the maximal sum of exponents of runs in a string of length nn is smaller than 2n2nComment: 7 pages, 1 figur

    Scheduling Jobs in Flowshops with the Introduction of Additional Machines in the Future

    Get PDF
    This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by Elsevier and can be found at: http://www.journals.elsevier.com/expert-systems-with-applications/.The problem of scheduling jobs to minimize total weighted tardiness in flowshops,\ud with the possibility of evolving into hybrid flowshops in the future, is investigated in\ud this paper. As this research is guided by a real problem in industry, the flowshop\ud considered has considerable flexibility, which stimulated the development of an\ud innovative methodology for this research. Each stage of the flowshop currently has\ud one or several identical machines. However, the manufacturing company is planning\ud to introduce additional machines with different capabilities in different stages in the\ud near future. Thus, the algorithm proposed and developed for the problem is not only\ud capable of solving the current flow line configuration but also the potential new\ud configurations that may result in the future. A meta-heuristic search algorithm based\ud on Tabu search is developed to solve this NP-hard, industry-guided problem. Six\ud different initial solution finding mechanisms are proposed. A carefully planned\ud nested split-plot design is performed to test the significance of different factors and\ud their impact on the performance of the different algorithms. To the best of our\ud knowledge, this research is the first of its kind that attempts to solve an industry-guided\ud problem with the concern for future developments

    Duel and sweep algorithm for order-preserving pattern matching

    Full text link
    Given a text TT and a pattern PP over alphabet Σ\Sigma, the classic exact matching problem searches for all occurrences of pattern PP in text TT. Unlike exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real values. In this paper, we propose an efficient algorithm for OPPM problem using the "duel-and-sweep" paradigm. Our algorithm runs in O(n+mlogm)O(n + m\log m) time in general and O(n+m)O(n + m) time under an assumption that the characters in a string can be sorted in linear time with respect to the string size. We also perform experiments and show that our algorithm is faster that KMP-based algorithm. Last, we introduce the two-dimensional order preserved pattern matching and give a duel and sweep algorithm that runs in O(n2)O(n^2) time for duel stage and O(n2m)O(n^2 m) time for sweeping time with O(m3)O(m^3) preprocessing time.Comment: 13 pages, 5 figure

    Bethe Ansatz in the Bernoulli Matching Model of Random Sequence Alignment

    Full text link
    For the Bernoulli Matching model of sequence alignment problem we apply the Bethe ansatz technique via an exact mapping to the 5--vertex model on a square lattice. Considering the terrace--like representation of the sequence alignment problem, we reproduce by the Bethe ansatz the results for the averaged length of the Longest Common Subsequence in Bernoulli approximation. In addition, we compute the average number of nucleation centers of the terraces.Comment: 14 pages, 5 figures (some points are clarified

    Two algorithms for the student-project allocation problem

    Get PDF
    We study the Student-Project Allocation problem (SPA), a generalisation of the classical Hospitals / Residents problem (HR). An instance of SPA involves a set of students, projects and lecturers. Each project is offered by a unique lecturer, and both projects and lecturers have capacity constraints. Students have preferences over projects, whilst lecturers have preferences over students. We present two optimal linear-time algorithms for allocating students to projects, subject to the preference and capacity constraints. In particular, each algorithm finds a stable matching of students to projects. Here, the concept of stability generalises the stability definition in the HR context. The stable matching produced by the first algorithm is simultaneously best-possible for all students, whilst the one produced by the second algorithm is simultaneously best-possible for all lecturers. We also prove some structural results concerning the set of stable matchings in a given instance of SPA. The SPA problem model that we consider is very general and has applications to a range of different contexts besides student-project allocation

    Time-frequency scaling transformation of the phonocardiogram based of the matching pursuit method.

    No full text
    International audienceA time-frequency scaling transformation based on the matching pursuit (MP) method is developed for the phonocardiogram (PCG). The MP method decomposes a signal into a series of time-frequency atoms by using an iterative process. The modification of the time scale of the PCG can be performed without perceptible change in its spectral characteristics. It is also possible to modify the frequency scale without changing the temporal properties. The technique has been tested on 11 PCG's containing heart sounds and different murmurs. A scaling/inverse-scaling procedure was used for quantitative evaluation of the scaling performance. Both the spectrogram and a MP-based Wigner distribution were used for visual comparison in the time-frequency domain. The results showed that the technique is suitable and effective for the time-frequency scale transformation of both the transient property of the heart sounds and the more complex random property of the murmurs. It is also shown that the effectiveness of the method is strongly related to the optimization of the parameters used for the decomposition of the signals

    Fast Label Extraction in the CDAWG

    Full text link
    The compact directed acyclic word graph (CDAWG) of a string TT of length nn takes space proportional just to the number ee of right extensions of the maximal repeats of TT, and it is thus an appealing index for highly repetitive datasets, like collections of genomes from similar species, in which ee grows significantly more slowly than nn. We reduce from O(mloglogn)O(m\log{\log{n}}) to O(m)O(m) the time needed to count the number of occurrences of a pattern of length mm, using an existing data structure that takes an amount of space proportional to the size of the CDAWG. This implies a reduction from O(mloglogn+occ)O(m\log{\log{n}}+\mathtt{occ}) to O(m+occ)O(m+\mathtt{occ}) in the time needed to locate all the occ\mathtt{occ} occurrences of the pattern. We also reduce from O(kloglogn)O(k\log{\log{n}}) to O(k)O(k) the time needed to read the kk characters of the label of an edge of the suffix tree of TT, and we reduce from O(mloglogn)O(m\log{\log{n}}) to O(m)O(m) the time needed to compute the matching statistics between a query of length mm and TT, using an existing representation of the suffix tree based on the CDAWG. All such improvements derive from extracting the label of a vertex or of an arc of the CDAWG using a straight-line program induced by the reversed CDAWG.Comment: 16 pages, 1 figure. In proceedings of the 24th International Symposium on String Processing and Information Retrieval (SPIRE 2017). arXiv admin note: text overlap with arXiv:1705.0864
    corecore