221,998 research outputs found

    Fully Online Grammar Compression in Constant Space

    We present novel variants of fully online LCA (FOLCA), a fully online grammar compression algorithm that builds a straight-line program (SLP) and directly encodes it into a succinct representation in an online manner. FOLCA enables a direct encoding of an SLP into a succinct representation that is asymptotically equivalent to an information-theoretic lower bound for representing an SLP (Maruyama et al., SPIRE'13). The compression of FOLCA takes time linear in the length of the input text, and its working space depends only on the size of the SLP, which enables us to apply FOLCA to large-scale repetitive texts. Recent repetitive texts, however, include some noise. For example, current sequencing technology has significant error rates, which embed noise into genome sequences. For such noisy repetitive texts, FOLCA, whose working space grows with the SLP size, consumes a large amount of memory. We present two variants of FOLCA working in constant space by leveraging the idea behind stream mining techniques. Experiments using 100 human genomes corresponding to about 300GB from the 1000 human genomes project revealed the applicability of our method to large-scale, noisy repetitive texts. Comment: This is an extended version of a proceeding accepted to Data Compression Conference (DCC), 201

    A computational evaluation of constructive and improvement heuristics for the blocking flow shop to minimize total flowtime

    This paper focuses on the blocking flow shop scheduling problem with the objective of total flowtime minimisation. This problem assumes that there are no buffers between machines and, due to its application to many manufacturing sectors, it has received growing attention from researchers in recent years. Since the problem is NP-hard, a large number of heuristics have been proposed to provide good solutions with reasonable computational times. In this paper, we conduct a comprehensive evaluation of the available heuristics for the problem and for related problems, resulting in the implementation and testing of a total of 35 heuristics. Furthermore, we propose an efficient constructive heuristic which successfully combines a pool of partial sequences in parallel, using a beam-search-based approach. The computational experiments show the excellent performance of the proposed heuristic as compared to the best-so-far algorithms for the problem, both in terms of quality of the solutions and of computational requirements. In fact, despite being a relatively fast constructive heuristic, it found new best upper bounds for more than 27% of Taillard’s instances. Ministerio de Ciencia e Innovación DPI2013-44461-P/DP

    Online Pattern Matching for String Edit Distance with Moves

    Edit distance with moves (EDM) is a string-to-string distance measure that includes substring moves in addition to ordinary editing operations to turn one string into the other. Although optimizing EDM is intractable, it has many applications, especially in error detection. Edit sensitive parsing (ESP) is an efficient parsing algorithm that guarantees an upper bound on parsing discrepancies between different appearances of the same substring in a string. ESP can be used for computing an approximate EDM as the L1 distance between characteristic vectors built from node labels in parse trees. However, ESP is not applicable to streaming text data, where the whole text is unknown in advance. We present an online ESP (OESP) that enables online pattern matching for EDM. OESP builds a parse tree for a streaming text and computes the L1 distance between characteristic vectors in an online manner. For the space-efficient computation of EDM, OESP directly encodes the parse tree into a succinct representation by leveraging the idea behind recent results on dynamic succinct trees. We experimentally test OESP on the ability to compute EDM in an online manner on benchmark datasets, and we show OESP's efficiency. Comment: This paper has been accepted to the 21st edition of the International Symposium on String Processing and Information Retrieval (SPIRE2014
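The abstract above approximates EDM by the L1 distance between characteristic vectors of parse-tree node labels. The following is a minimal sketch of that distance computation only, assuming the node labels have already been produced by some parser; the function names and the example labels are hypothetical and do not come from the paper, and no ESP or OESP parsing is implemented here.

```python
from collections import Counter


def characteristic_vector(labels):
    """Count how often each node label occurs (hypothetical helper)."""
    return Counter(labels)


def l1_distance(u, v):
    """Sum of absolute count differences over all labels seen in u or v."""
    # Counter returns 0 for labels that are missing from one side.
    return sum(abs(u[label] - v[label]) for label in u.keys() | v.keys())


# Hypothetical label multisets for two parse trees.
tree_a = characteristic_vector(["A", "B", "A"])
tree_b = characteristic_vector(["A", "C"])
distance = l1_distance(tree_a, tree_b)  # |2-1| + |1-0| + |0-1|
```

Storing the vectors as sparse counters keeps the cost proportional to the number of distinct labels rather than to the size of a label alphabet.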

    New efficient constructive heuristics for the hybrid flowshop to minimise makespan: A computational evaluation of heuristics

    This paper addresses the hybrid flow shop scheduling problem to minimise makespan, a well-known scheduling problem for which many constructive heuristics have been proposed in the literature. Nevertheless, the state of the art is not clear due to partial or non-homogeneous comparisons. In this paper, we review these heuristics and perform a comprehensive computational evaluation to determine which are the most efficient ones. A total of 20 heuristics are implemented and compared in this study. In addition, we propose four new heuristics for the problem. Firstly, two memory-based constructive heuristics are proposed, where a sequence is constructed by inserting jobs one by one in a partial sequence. The most promising insertions tested are kept in a list. However, in contrast to tabu search, these insertions are repeated in future iterations instead of being forbidden. Secondly, we propose two constructive heuristics based on Johnson’s algorithm for the permutation flowshop scheduling problem. The computational experiments carried out on an extensive testbed show that the new proposals outperform the existing heuristics. Ministerio de Ciencia e Innovación DPI2016-80750-
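For readers unfamiliar with the Johnson's algorithm mentioned above, here is a minimal sketch of the classical rule for the two-machine flow shop with makespan objective. It is not the hybrid flow shop heuristics proposed in the paper; the function names and the example data are hypothetical.

```python
def johnson_order(jobs):
    """Return a job order by Johnson's rule.

    jobs is a list of (a, b) pairs: processing times on machine 1 and 2.
    Jobs with a <= b go first in increasing a; the rest go last in
    decreasing b.
    """
    indices = range(len(jobs))
    first = sorted((j for j in indices if jobs[j][0] <= jobs[j][1]),
                   key=lambda j: jobs[j][0])
    last = sorted((j for j in indices if jobs[j][0] > jobs[j][1]),
                  key=lambda j: jobs[j][1], reverse=True)
    return first + last


def makespan(order, jobs):
    """Completion time of the last job on machine 2 for a given order."""
    c1 = c2 = 0
    for j in order:
        a, b = jobs[j]
        c1 += a                 # machine 1 processes jobs back to back
        c2 = max(c1, c2) + b    # machine 2 waits for machine 1 if needed
    return c2


# Hypothetical instance with four jobs.
jobs = [(3, 2), (1, 4), (2, 1), (4, 5)]
order = johnson_order(jobs)
```

On this instance the rule places jobs 1 and 3 first (short first operation) and jobs 0 and 2 last (short second operation).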

    A beam-search-based constructive heuristic for the PFSP to minimise total flowtime

    In this paper we present a beam-search-based constructive heuristic to solve the permutation flowshop scheduling problem with total flowtime minimisation as objective. This well-known problem is NP-hard, and several heuristics have been developed in the literature. The proposed algorithm is inspired by the logic of beam search, although it remains a fast constructive heuristic. The results obtained by the proposed algorithm outperform those obtained by other constructive heuristics in the literature for the problem, thus substantially modifying the state of the art of efficient approximate procedures for the problem. In addition, the proposed algorithm even outperforms two of the best metaheuristics for many instances of the problem, using much less computational effort. The excellent performance of the proposal is also shown by the fact that the new heuristic found new best upper bounds for 35 of the 120 instances in Taillard’s benchmark. Ministerio de Ciencia e Innovación DPI2013-44461-P; Ministerio de Ciencia e Innovación DPI2016-80750-
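To make the objective and the search idea above concrete, the sketch below computes the total flowtime of a permutation and runs a plain, generic beam search over partial sequences. It is an illustration under assumptions, not the heuristic proposed in the paper: the names `total_flowtime`, `beam_search`, and `width`, the ranking of partial sequences by their own flowtime, and the example data are all hypothetical.

```python
def total_flowtime(sequence, p):
    """Sum of job completion times on the last machine.

    p[j][i] is the processing time of job j on machine i.
    """
    if not sequence:
        return 0
    machines = len(p[0])
    completion = [0] * machines  # completion of the previous job per machine
    total = 0
    for j in sequence:
        completion[0] += p[j][0]
        for i in range(1, machines):
            # A job starts on machine i once it has left machine i-1
            # and machine i has finished the previous job.
            completion[i] = max(completion[i], completion[i - 1]) + p[j][i]
        total += completion[-1]
    return total


def beam_search(p, width=2):
    """Grow sequences job by job, keeping the `width` best partial ones."""
    n = len(p)
    beam = [()]
    for _ in range(n):
        candidates = [partial + (j,)
                      for partial in beam
                      for j in range(n) if j not in partial]
        # Assumption: rank partial sequences by their flowtime so far.
        candidates.sort(key=lambda s: total_flowtime(s, p))
        beam = candidates[:width]
    return beam[0]


# Hypothetical instance: 3 jobs, 2 machines.
p = [[3, 2], [1, 4], [2, 1]]
best = beam_search(p, width=2)
```

A larger `width` explores more partial sequences at higher cost; once `width` covers every partial sequence at each level, the search is exhaustive.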

    A General, Sound and Efficient Natural Language Parsing Algorithm based on Syntactic Constraints Propagation

    This paper presents a new context-free parsing algorithm based on a bidirectional strictly horizontal strategy which incorporates strong top–down predictions (derivations and adjacencies). From a functional point of view, the parser is able to propagate syntactic constraints reducing parsing ambiguity. From a computational perspective, the algorithm includes different techniques aimed at the improvement of the manipulation and representation of the structures used