Fully Online Grammar Compression in Constant Space
We present novel variants of fully online LCA (FOLCA), a fully online grammar
compression that builds a straight line program (SLP) and directly encodes it
into a succinct representation in an online manner. FOLCA enables a direct
encoding of an SLP into a succinct representation that is asymptotically
equivalent to an information theoretic lower bound for representing an SLP
(Maruyama et al., SPIRE'13). The compression of FOLCA takes linear time
proportional to the length of an input text and its working space depends only
on the size of the SLP, which enables us to apply FOLCA to large-scale
repetitive texts. Recent repetitive texts, however, include some noise. For
example, current sequencing technology has significant error rates, which
embed noise into genome sequences. For such noisy repetitive texts, FOLCA,
whose working space grows with the SLP size, consumes a large amount of memory. We present two
variants of FOLCA working in constant space by leveraging the idea behind
stream mining techniques. Experiments using 100 human genomes corresponding to
about 300GB from the 1000 human genomes project revealed the applicability of
our method to large-scale, noisy repetitive texts.
Comment: This is an extended version of a proceeding accepted to Data
Compression Conference (DCC), 201
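As a small illustration of what a straight line program (SLP) is, the sketch below expands a toy grammar back into the single string it derives. It is not FOLCA and does not use a succinct encoding; the rule names (`X1`, `X2`, `X3`) and the dictionary layout are assumptions made only for this example.

```python
from typing import Dict, Tuple

# Hypothetical toy SLP: every nonterminal has exactly one rule with two
# right-hand-side symbols. Any symbol without a rule is a terminal.
Rule = Tuple[str, str]


def expand(symbol: str, rules: Dict[str, Rule]) -> str:
    """Return the unique string derived from `symbol`."""
    if symbol not in rules:
        return symbol  # terminal character
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


# Three rules are enough to describe a six-character repetitive text.
rules = {
    "X1": ("a", "b"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X1"),
}
text = expand("X3", rules)
```

The more repetitive the text, the fewer rules are needed relative to its length, which is why working space tied to the SLP size is attractive for clean repetitive data and costly for noisy data.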
A computational evaluation of constructive and improvement heuristics for the blocking flow shop to minimize total flowtime
This paper focuses on the blocking flow shop scheduling problem with the objective of total flowtime minimisation. This problem assumes that there are no buffers between machines and, due to its application to many manufacturing sectors, it has received growing attention from researchers in recent years. Since the problem is NP-hard, a large number of heuristics have been proposed to provide good solutions within reasonable computational times. In this paper, we conduct a comprehensive evaluation of the available heuristics for the problem and for related problems, resulting in the implementation and testing of a total of 35 heuristics. Furthermore, we propose an efficient constructive heuristic which successfully combines a pool of partial sequences in parallel, using a beam-search-based approach. The computational experiments show the excellent performance of the proposed heuristic as compared to the best-so-far algorithms for the problem, both in terms of quality of the solutions and of computational requirements. In fact, despite being a relatively fast constructive heuristic, it has found new best upper bounds for more than 27% of Taillard’s instances.
Ministerio de Ciencia e Innovación DPI2013-44461-P/DP
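To make the objective concrete, the following sketch evaluates the total flowtime of a given job sequence in a blocking flow shop using the usual departure-time recurrence: a job can only leave a machine once the next machine has been vacated. The function name and the `p[job][machine]` layout are assumptions for illustration; this evaluates a sequence and is not the heuristic proposed in the paper.

```python
from typing import List, Sequence


def blocking_total_flowtime(seq: Sequence[int], p: List[List[int]]) -> int:
    """Sum of completion times for `seq` with no buffers between machines.

    p[j][i] is the processing time of job j on machine i (0-based).
    d[0] is the time the job starts on the first machine; d[i] (i >= 1)
    is the time the job departs from machine i (1-based).
    """
    m = len(p[0])
    prev = [0] * (m + 1)  # departure times of the previously sequenced job
    total = 0
    for job in seq:
        d = [0] * (m + 1)
        d[0] = prev[1]  # first machine is free once the previous job left it
        for i in range(1, m):
            # Leave machine i when processing is done AND machine i+1 is free.
            d[i] = max(d[i - 1] + p[job][i - 1], prev[i + 1])
        d[m] = d[m - 1] + p[job][m - 1]  # the last machine never blocks
        total += d[m]
        prev = d
    return total


times = [[2, 3], [4, 1]]  # two jobs, two machines (made-up data)
flowtime = blocking_total_flowtime([0, 1], times)
```

A constructive heuristic would call an evaluation like this repeatedly while building a sequence, so its cost per call matters in practice.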
Online Pattern Matching for String Edit Distance with Moves
Edit distance with moves (EDM) is a string-to-string distance measure that
includes substring moves in addition to ordinary editing operations to turn one
string into the other. Although computing EDM exactly is intractable, it has many
applications, especially in error detection. Edit sensitive parsing (ESP) is an
efficient parsing algorithm that guarantees an upper bound of parsing
discrepancies between different appearances of the same substrings in a string.
ESP can be used for computing an approximate EDM as the L1 distance between
characteristic vectors built from node labels in parsing trees. However, ESP is
not applicable to streaming text data, where the whole text is unknown in
advance. We present an online ESP (OESP) that enables online pattern
matching for EDM. OESP builds a parse tree for a streaming text and computes
the L1 distance between characteristic vectors in an online manner. For the
space-efficient computation of EDM, OESP directly encodes the parse tree into a
succinct representation by leveraging the idea behind recent results of a
dynamic succinct tree. We experimentally test OESP on the ability to compute
EDM in an online manner on benchmark datasets, and we show OESP's efficiency.
Comment: This paper has been accepted to the 21st edition of the International
Symposium on String Processing and Information Retrieval (SPIRE2014
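The approximation step described above, the L1 distance between characteristic vectors of node labels, can be sketched as follows. The parsing itself (ESP or OESP) is not implemented here; the label lists are made-up inputs standing in for the node labels of two parse trees, and `l1_distance` is a hypothetical helper name.

```python
from collections import Counter
from typing import Hashable, Iterable


def l1_distance(labels_a: Iterable[Hashable], labels_b: Iterable[Hashable]) -> int:
    """L1 distance between the label-count vectors of two parse trees."""
    va, vb = Counter(labels_a), Counter(labels_b)
    # A Counter returns 0 for labels that are missing on one side.
    return sum(abs(va[k] - vb[k]) for k in va.keys() | vb.keys())


# Made-up node labels for two small parse trees.
distance = l1_distance(["A", "A", "B"], ["A", "C"])
```

Because only label counts are needed, the vectors can be updated incrementally as new nodes are created, which is what makes an online computation plausible.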
New efficient constructive heuristics for the hybrid flowshop to minimise makespan: A computational evaluation of heuristics
This paper addresses the hybrid flow shop scheduling problem to minimise makespan, a well-known scheduling problem for which many constructive heuristics have been proposed in the literature. Nevertheless, the state of the art is not clear due to partial or non-homogeneous comparisons. In this paper, we review these heuristics and perform a comprehensive computational evaluation to determine which are the most efficient ones. A total of 20 heuristics are implemented and compared in this study. In addition, we propose four new heuristics for the problem. Firstly, two memory-based constructive heuristics are proposed, where a sequence is constructed by inserting jobs one by one in a partial sequence. The most promising insertions tested are kept in a list. However, in contrast to tabu search, these insertions are repeated in future iterations instead of being forbidden. Secondly, we propose two constructive heuristics based on Johnson’s algorithm for the permutation flowshop scheduling problem. The computational experiments carried out on an extensive testbed show that the new proposals outperform the existing heuristics.
Ministerio de Ciencia e Innovación DPI2016-80750-
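For readers unfamiliar with Johnson’s algorithm, the sketch below shows the classical rule for the two-machine flow shop: jobs that are faster on the first machine go first in increasing order of that time, and the remaining jobs follow in decreasing order of their second-machine time. How the paper adapts this rule to the hybrid flow shop is not stated in the abstract, so this is only the textbook version, with hypothetical function names and made-up data.

```python
from typing import List, Sequence


def johnson_order(p1: Sequence[int], p2: Sequence[int]) -> List[int]:
    """Johnson's rule for the two-machine flow shop (makespan objective)."""
    jobs = range(len(p1))
    # Jobs at least as fast on machine 1: ascending machine-1 time.
    first = sorted((j for j in jobs if p1[j] <= p2[j]), key=lambda j: p1[j])
    # Remaining jobs: descending machine-2 time.
    last = sorted((j for j in jobs if p1[j] > p2[j]), key=lambda j: p2[j], reverse=True)
    return first + last


def makespan_two_machines(order: Sequence[int], p1: Sequence[int], p2: Sequence[int]) -> int:
    """Completion time of the last job on machine 2 for the given order."""
    c1 = c2 = 0
    for j in order:
        c1 += p1[j]
        c2 = max(c1, c2) + p2[j]
    return c2


p1 = [3, 5, 1, 6, 7]  # made-up processing times on machine 1
p2 = [6, 2, 2, 6, 5]  # made-up processing times on machine 2
order = johnson_order(p1, p2)
cmax = makespan_two_machines(order, p1, p2)
```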
A beam-search-based constructive heuristic for the PFSP to minimise total flowtime
In this paper we present a beam-search-based constructive heuristic to solve the
permutation flowshop scheduling problem with total flowtime minimisation as objective. This well-known problem is NP-hard, and several heuristics have been developed
in the literature. The proposed algorithm is inspired by the logic of beam search,
although it remains a fast constructive heuristic.
The results obtained by the proposed algorithm outperform those obtained by
other constructive heuristics in the literature for the problem, thus substantially changing the state of the art of efficient approximate procedures for the problem. In
addition, the proposed algorithm even outperforms two of the best metaheuristics for
many instances of the problem, using much less computational effort. The excellent
performance of the proposal is also shown by the fact that the new heuristic found
new best upper bounds for 35 of the 120 instances in Taillard’s benchmark.
Ministerio de Ciencia e Innovación DPI2013-44461-P
Ministerio de Ciencia e Innovación DPI2016-80750-
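The general idea of a beam-search-style constructive procedure can be sketched as follows: partial sequences are extended one job at a time, and only the best few are kept at each level. This is a generic beam search that ranks partial sequences by their partial total flowtime, which is a simplifying assumption; the abstract does not describe the paper's actual evaluation of partial sequences, and the names `total_flowtime`, `beam_search`, and `width` are hypothetical.

```python
from typing import List, Sequence, Tuple


def total_flowtime(seq: Sequence[int], p: List[List[int]]) -> int:
    """Sum of completion times on the last machine in a permutation flow shop."""
    m = len(p[0])
    c = [0] * m  # completion time of the latest job on each machine
    total = 0
    for j in seq:
        c[0] += p[j][0]
        for i in range(1, m):
            c[i] = max(c[i], c[i - 1]) + p[j][i]
        total += c[-1]
    return total


def beam_search(p: List[List[int]], width: int = 2) -> List[int]:
    """Build a full sequence, keeping the `width` best partial sequences per level."""
    n = len(p)
    beam: List[Tuple[int, ...]] = [()]
    for _ in range(n):
        candidates = [s + (j,) for s in beam for j in range(n) if j not in s]
        # Assumption: rank partial sequences by their partial total flowtime.
        candidates.sort(key=lambda s: total_flowtime(s, p))
        beam = candidates[:width]
    return list(beam[0])


times = [[3, 2], [1, 4], [2, 1]]  # three jobs, two machines (made-up data)
best = beam_search(times, width=2)
```

With `width=1` the procedure reduces to a greedy construction; a wider beam trades computation time for a better chance of keeping a promising partial sequence alive.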
A General, Sound and Efficient Natural Language Parsing Algorithm based on Syntactic Constraints Propagation
This paper presents a new context-free parsing algorithm based on a bidirectional
strictly horizontal strategy which incorporates strong top–down predictions (derivations
and adjacencies). From a functional point of view, the parser is able to propagate
syntactic constraints, reducing parsing ambiguity. From a computational perspective,
the algorithm includes different techniques aimed at the improvement of the manipulation
and representation of the structures used
- …