19 research outputs found
Recommended from our members
The least weight subsequence problem
The least weight subsequence (LWS) problem is introduced, and is shown to be equivalent to the classic minimum path problem for directed graphs. A special case of the LWS problem is shown to be solvable in O(n log n) time generally and, for certain weight functions, in linear time. A number of applications are given, including an optimum paragraph formation problem and the problem of finding a minimum height B-tree, whose solutions realize improvement in asymptotic time complexity.
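To make the equivalence with shortest paths concrete, here is a minimal sketch of the straightforward quadratic dynamic program for a least weight subsequence instance. It is not the faster algorithm the abstract describes. The function name `least_weight_subsequence`, the callable `w(i, j)`, and the example weight are assumptions made for illustration.

```python
def least_weight_subsequence(n, w):
    """Return (cost, path) for the cheapest sequence 0 = i_0 < ... < i_t = n.

    w(i, j) is the weight of stepping from i to j (i < j). This is the plain
    O(n^2) dynamic program, i.e. a shortest path in the complete DAG on 0..n.
    """
    best = [0.0] + [float("inf")] * n   # best[j]: cheapest way to reach j
    prev = [None] * (n + 1)             # prev[j]: predecessor of j on that path
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + w(i, j)
            if cand < best[j]:
                best[j], prev[j] = cand, i
    # Walk the predecessor links back from n to recover the subsequence.
    path = [n]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return best[n], path[::-1]


# Hypothetical weight: steps of length 2 are free, other lengths are penalized.
cost, path = least_weight_subsequence(6, lambda i, j: (j - i - 2) ** 2)
print(cost, path)
```

With this example weight the only zero-cost solution takes three steps of length 2.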
A Linear-Time Algorithm for Concave One-Dimensional Dynamic Programming
The least weight subsequence problem is a special case of the one-dimensional dynamic programming problem where D[i] = E[i]. The modified edit distance problem, which arises in molecular biology, geology, and speech recognition, can be decomposed into 2n copies of the problem.
Parallel algorithm for the matrix chain product problem
This paper considers the problem of finding an optimal order of the multiplication chain of matrices. All known parallel algorithms use the dynamic programming approach and run in polylogarithmic time using, in the best case, $n^6/\log^6 n$ processors. Our algorithm uses a different approach and reduces the problem to computing some recurrence on a tree. We show that this recurrence can be optimally solved, which enables us to improve the parallel bound by a few factors. Our algorithm runs in $O(\log^3 n)$ time using $n^2/\log^3 n$ processors on a CREW PRAM and $O(\log^2 n \log\log n)$ time using $n^2/(\log^2 n \log\log n)$ processors on a CRCW PRAM. This algorithm also solves the problem of finding an optimal triangulation in a convex polygon. We show that for a monotone polygon this result can be improved further to an $O(\log^2 n)$ time, $n$ processor algorithm on a CREW PRAM.
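For orientation, the textbook sequential dynamic program for the matrix chain ordering problem can be sketched as follows. It is the cubic-time baseline, not the tree-recurrence algorithm of the paper, and the function name `matrix_chain_cost` and the input convention (matrix `i` has shape `dims[i] x dims[i + 1]`) are assumptions made for illustration.

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications to multiply the whole chain.

    Matrix i has shape dims[i] x dims[i + 1]; classic O(n^3) interval DP.
    """
    n = len(dims) - 1                      # number of matrices in the chain
    if n <= 0:
        return 0
    cost = [[0] * n for _ in range(n)]     # cost[i][j]: best cost for matrices i..j
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Try every position k for the last multiplication (i..k) * (k+1..j).
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]


print(matrix_chain_cost([10, 30, 5, 60]))  # (A1 A2) A3 is the cheaper order
```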
Speeding up dynamic programming with applications to molecular biology
Consider the problem of computing $E[j] = \min_{0 \le k \le j-1} \{D[k] + w(k, j)\}$, $j = 1, \ldots, n$, where $w$ is a given weight function, $D[0]$ is given and for every $k = 1, \ldots, n$, $D[k]$ is easily computable from $E[k]$. This problem appears as a subproblem in dynamic programming solutions to various problems. Obviously, it can be solved in time $O(n^2)$, and for a general weight function no better algorithm is possible. We consider two dual cases that arise in applications: In the concave case, the weight function satisfies the quadrangle inequality: $w(k, j) + w(l, j') \le w(l, j) + w(k, j')$, for all $k \le l \le j \le j'$. In the convex case, the weight function satisfies the inverse quadrangle inequality. In both cases we show how to use the assumed property of $w$ to derive an $O(n \log n)$ algorithm. Even better, linear-time algorithms are obtained if $w$ satisfies the following additional closest zero property: for every two integers $l$ and $k$, $l < k$, and real number $a$, the smallest zero of $f(x) = w(l, x) - w(k, x) - a$ which is larger than $l$ can be found in constant time. Surprisingly, the two algorithms are also dual in the following sense: Both work in stages. In the $j$-th stage they compute $E[j]$. They maintain a set of candidates which satisfies the property that $E[j]$ depends only on $D[k] + w(k, j)$ for $k$'s in the set. Moreover, each algorithm discards candidates from the set, and discarded candidates never rejoin the set. To be able to maintain such a set of candidates efficiently one uses the following "dual" data structures: a queue in the concave case and a stack in the convex case. The two algorithms speed up several dynamic programming routines that solve as a subproblem the problem above. The speed-up is from $O(n^3)$ to $O(n^2 \log n)$ or $O(n^2)$. Applications include algorithms for comparing DNA sequences, algorithms for determining the secondary structure of RNA, and algorithms used in speech recognition and geology.
One typical problem is the following: Given the cost of substituting any pair of symbols and a convex cost function g for gaps (where g(r) is the cost of a gap of size r), compute the modified edit distance between the two given sequences.
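The queue-based idea for the concave case can be sketched roughly as below. This is a common textbook-style formulation with a binary search per inserted candidate, giving O(n log n) evaluations of the weight function; it is not claimed to be the authors' exact procedure. It assumes the simplest setting D[k] = E[k], a weight `w(k, j)` satisfying the quadrangle inequality stated in the abstract, and hypothetical function names `solve_naive` and `solve_concave`.

```python
from collections import deque


def solve_naive(n, w, d0=0):
    """Quadratic reference: D[j] = min over k < j of D[k] + w(k, j)."""
    D = [d0] + [None] * n
    for j in range(1, n + 1):
        D[j] = min(D[k] + w(k, j) for k in range(j))
    return D


def solve_concave(n, w, d0=0):
    """Same recurrence, assuming w satisfies the quadrangle inequality."""
    D = [d0] + [None] * n
    cost = lambda k, j: D[k] + w(k, j)
    # Each entry (k, start): candidate k is the best known choice for
    # positions from `start` up to the next entry's start.
    queue = deque([(0, 1)])
    for j in range(1, n + 1):
        # Drop front candidates whose range ended before j.
        while len(queue) > 1 and queue[1][1] <= j:
            queue.popleft()
        D[j] = cost(queue[0][0], j)
        if j == n:
            break
        # Discard back candidates that j beats on their whole remaining range.
        while queue:
            k, start = queue[-1]
            s = max(start, j + 1)
            if cost(j, s) <= cost(k, s):
                queue.pop()
            else:
                break
        if not queue:
            queue.append((j, j + 1))
            continue
        # Binary search for the first position where j overtakes the back candidate.
        k, start = queue[-1]
        lo, hi = max(start, j + 1) + 1, n + 1
        while lo < hi:
            mid = (lo + hi) // 2
            if cost(j, mid) <= cost(k, mid):
                hi = mid
            else:
                lo = mid + 1
        if lo <= n:
            queue.append((j, lo))
    return D


# Hypothetical weight: a convex function of the gap j - k satisfies the inequality.
w = lambda k, j: (j - k - 3) ** 2
print(solve_concave(10, w) == solve_naive(10, w))
```

The binary search relies on the fact that, under the quadrangle inequality, once a newer candidate beats an older one at some position it keeps beating it at every later position, so discarded candidates never need to return.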
Faster optimal univariate microaggregation
Microaggregation is a method to coarsen a dataset, by optimally clustering
data points in groups of at least k points, thereby providing a k-anonymity
type disclosure guarantee for each point in the dataset. Previous algorithms
for univariate microaggregation had a time complexity. By rephrasing
microaggregation as an instance of the concave least weight subsequence
problem, in this work we provide improved algorithms that provide an optimal
univariate microaggregation on sorted data in time and space. We further
show that our algorithms work not only for sum of squares cost functions, as
typically considered, but seamlessly extend to many other cost functions used
for univariate microaggregation tasks. In experiments we show that the
presented algorithms lead to real world performance improvements.
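The rephrasing as a least weight subsequence problem can be illustrated with a simple quadratic dynamic program over sorted data, where the weight of a step from `i` to `j` is the sum of squared errors of the group `x[i:j]` and groups smaller than `k` are not allowed. This is a baseline sketch only, not the improved algorithms of the paper; the function name `microaggregate` and the sample data are assumptions made for illustration.

```python
def microaggregate(values, k):
    """Optimal univariate microaggregation by a plain O(n^2) dynamic program.

    Returns (total sum of squared errors, list of groups); every group
    contains at least k consecutive points of the sorted data.
    """
    x = sorted(values)
    n = len(x)
    # Prefix sums of values and squares give each group's cost in O(1).
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    best = [0.0] + [inf] * n     # best[j]: optimal cost for the first j points
    prev = [None] * (n + 1)
    for j in range(k, n + 1):
        for i in range(j - k + 1):           # last group x[i:j] has size >= k
            if best[i] < inf and best[i] + sse(i, j) < best[j]:
                best[j], prev[j] = best[i] + sse(i, j), i
    if best[n] == inf:
        raise ValueError("need at least k data points")
    groups, j = [], n
    while j > 0:
        groups.append(x[prev[j]:j])
        j = prev[j]
    return best[n], groups[::-1]


cost, groups = microaggregate([10, 1, 12, 2, 3, 11], k=2)
print(cost, groups)
```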
Distribution-Aware Compressed Full-Text Indexes
Peer reviewed
Distribution-aware compressed full-text indexes
In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight K-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.
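The minimum weight K-link path problem itself can be sketched with a simple layered dynamic program. The sketch below works on a generic complete DAG over nodes 0..n with a caller-supplied edge weight, not on the specific graph designed in the paper, and it runs in O(K n^2) time; the function name `min_weight_k_link_path` and the example weight are assumptions made for illustration.

```python
def min_weight_k_link_path(n, K, w):
    """Minimum weight of a path from node 0 to node n using exactly K edges.

    Edges go from i to j for every i < j, with weight w(i, j).
    Returns float('inf') if no such path exists (for example when K > n).
    """
    inf = float("inf")
    best = [0.0] + [inf] * n          # paths with zero edges: only node 0 is reachable
    for _ in range(K):
        nxt = [inf] * (n + 1)         # nxt[j]: best path to j with one more edge
        for j in range(1, n + 1):
            for i in range(j):
                if best[i] < inf:
                    nxt[j] = min(nxt[j], best[i] + w(i, j))
        best = nxt
    return best[n]


# Hypothetical weight: squared edge length, so equal-length links are cheapest.
print(min_weight_k_link_path(6, 3, lambda i, j: (j - i) ** 2))
```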