Efficient pebbling for list traversal synopses
We show how to support efficient back traversal in a unidirectional list,
using small memory and with essentially no slowdown in forward steps. Using
O(log n) memory for a list of size n, the i'th back-step from the farthest
point reached so far takes O(log i) time in the worst case, while the overhead
per forward step is at most ε, for an arbitrarily small constant ε > 0. An
arbitrary sequence of forward and back steps is allowed. A full trade-off
between memory usage and time per back-step is presented: O(k) memory
vs. O(k n^{1/k}) time per back-step, and vice versa. Our algorithms are based
on a novel pebbling technique which moves pebbles on a virtual binary, or d-ary,
tree that can only be traversed in a pre-order fashion. The compact data
structures used by the pebbling algorithms, called list traversal synopses,
extend to general directed graphs, and have other interesting applications,
including memory efficient hash-chain implementation. Perhaps the most
surprising application is in showing that for any program, arbitrary rollback
steps can be efficiently supported with small overhead in memory, and marginal
overhead in its ordinary execution. More concretely: let P be a program that
runs for at most T steps, using memory of size M. Then, at the cost of
recording the input used by the program, and increasing the memory by a factor
of O(log T) to O(M log T), the program can be extended to support an
arbitrary sequence of forward execution and rollback steps: the i'th rollback
step takes O(log i) time in the worst case, while forward steps take O(1)
time in the worst case, and 1 + ε amortized time per step.
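To make the memory/time trade-off concrete, here is a minimal Python sketch of a much simpler baseline than the paper's pebbling technique: it keeps a checkpoint node roughly every sqrt(n) forward steps of a singly linked list, so each back-step re-walks at most about sqrt(n) next-pointers. All class and method names are illustrative; the paper's synopses achieve exponentially better bounds.

```python
# A deliberately simple baseline, NOT the paper's pebbling technique: keep a
# checkpoint node every ~sqrt(n) forward steps, so a back-step re-walks at
# most ~sqrt(n) next-pointers from the nearest checkpoint. The paper's
# synopses improve this to O(log n) memory and O(log i) time per back-step.
import math


class Node:
    def __init__(self, val, nxt=None):
        self.val, self.next = val, nxt


class BackTraversableList:
    def __init__(self, head, n):
        self.stride = max(1, math.isqrt(n))    # checkpoint spacing
        self.checkpoints = [head]              # node at index 0, stride, 2*stride, ...
        self.cur, self.pos = head, 0

    def forward(self):
        self.cur = self.cur.next
        self.pos += 1
        if self.pos % self.stride == 0 and self.pos // self.stride == len(self.checkpoints):
            self.checkpoints.append(self.cur)  # at most ~sqrt(n) checkpoints kept
        return self.cur

    def back(self):
        target = self.pos - 1                  # index we want to return to
        node = self.checkpoints[target // self.stride]
        for _ in range(target % self.stride):  # replay at most ~sqrt(n) hops
            node = node.next
        self.cur, self.pos = node, target
        return node


# Build 0 -> 1 -> ... -> 9, walk forward five steps, then step back twice.
head = None
for v in reversed(range(10)):
    head = Node(v, head)
lst = BackTraversableList(head, 10)
for _ in range(5):
    lst.forward()
print(lst.cur.val, lst.back().val, lst.back().val)   # -> 5 4 3
```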
Polyhedral Combinatorics of UPGMA Cones
Distance-based methods such as UPGMA (Unweighted Pair Group Method with
Arithmetic Mean) continue to play a significant role in phylogenetic research.
We use polyhedral combinatorics to analyze the natural subdivision of the
positive orthant induced by classifying the input vectors according to tree
topologies returned by the algorithm. The partition lattice informs the study
of UPGMA trees. We give a closed form for the extreme rays of UPGMA cones on n
taxa, and compute the normalized volumes of the UPGMA cones for small n.
Keywords: phylogenetic trees, polyhedral combinatorics, partition lattice
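For readers who want the classified map in front of them, the following is a minimal, naive Python sketch of UPGMA (cubic time): a matrix of pairwise distances goes in, a rooted tree topology comes out, by repeatedly merging the closest pair and updating distances as size-weighted arithmetic means. Function and variable names are illustrative and not taken from the paper.

```python
# Naive UPGMA sketch, only to make concrete the map whose fibers the paper's
# cones describe: distance data in, rooted tree topology out.
def upgma(labels, dist):
    """labels: taxon names; dist[i][j]: symmetric pairwise distances."""
    active = list(range(len(labels)))
    trees = {i: labels[i] for i in active}          # cluster id -> nested tuple
    sizes = {i: 1 for i in active}
    d = {(i, j): dist[i][j] for i in active for j in active if i < j}
    nxt = len(labels)
    while len(active) > 1:
        i, j = min(d, key=d.get)                    # closest pair of clusters
        trees[nxt] = (trees[i], trees[j])
        sizes[nxt] = sizes[i] + sizes[j]
        active = [k for k in active if k not in (i, j)]
        for k in active:                            # size-weighted mean update
            dik = d[tuple(sorted((i, k)))]
            djk = d[tuple(sorted((j, k)))]
            d[(k, nxt)] = (sizes[i] * dik + sizes[j] * djk) / sizes[nxt]
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        active.append(nxt)
        nxt += 1
    return trees[active[0]]


# Four taxa: a and b merge first, then c and d, then the two pairs.
D = [[0, 2, 6, 6],
     [2, 0, 6, 6],
     [6, 6, 0, 4],
     [6, 6, 4, 0]]
print(upgma(["a", "b", "c", "d"], D))   # -> (('a', 'b'), ('c', 'd'))
```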
Time-Optimal Tree Computations on Sparse Meshes
The main goal of this work is to assess the suitability of the mesh with multiple broadcasting architecture (MMB) for some tree-related computations. We view our contribution at two levels: on the one hand, we exhibit time lower bounds for a number of tree-related problems on the MMB. On the other hand, we show that these lower bounds are tight by exhibiting time-optimal tree algorithms on the MMB. Specifically, we show that the task of encoding and/or decoding n-node binary and ordered trees cannot be solved faster than Ω(log n) time even if the MMB has an infinite number of processors. We then go on to show that this lower bound is tight. We also show that the task of reconstructing n-node binary trees and ordered trees from their traversals can be performed in O(1) time on the same architecture. Our algorithms rely on novel time-optimal algorithms on sequences of parentheses that we also develop.
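The paper's O(1)-time reconstruction is a parallel MMB result; as a point of reference only, the Python sketch below shows the standard sequential reconstruction of a binary tree from its preorder and inorder traversals, with illustrative names.

```python
# Standard sequential reconstruction of a binary tree from its preorder and
# inorder traversals; included only to fix what "reconstruction from
# traversals" means, not as the paper's parallel algorithm.
def rebuild(preorder, inorder):
    """Return the tree as nested tuples (root, left, right); None for empty."""
    pos = {v: i for i, v in enumerate(inorder)}   # value -> position in inorder
    it = iter(preorder)

    def build(lo, hi):                            # rebuild the slice inorder[lo:hi]
        if lo >= hi:
            return None
        root = next(it)                           # next unused preorder value is the root
        k = pos[root]
        left = build(lo, k)                       # everything left of the root in inorder
        right = build(k + 1, hi)
        return (root, left, right)

    return build(0, len(inorder))


print(rebuild([2, 1, 3], [1, 2, 3]))   # -> (2, (1, None, None), (3, None, None))
```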
Fast Parallel Algorithms for Basic Problems
Parallel processing is one of the most active research areas these days. We are interested in one aspect of parallel processing, i.e. the design and analysis of parallel algorithms. Here, we focus on non-numerical parallel algorithms for basic combinatorial problems, such as data structures, selection, searching, merging and sorting. The purposes of studying these types of problems are to obtain basic building blocks which will be useful in solving complex problems, and to develop fundamental algorithmic techniques.
In this thesis, we study the following problems: priority queues, multiple search and multiple selection, and reconstruction of a binary tree from its traversals. The research on priority queues was motivated by their various applications. The purpose of studying multiple search and multiple selection is to explore the relationships between four of the most fundamental problems in algorithm design, that is, selection, searching, merging and sorting; our parallel solutions can also be used as subroutines in algorithms for other problems. The research on the last problem, reconstruction of a binary tree from its traversals, was stimulated by a challenge proposed in a recent paper by Berkman et al. (Highly Parallelizable Problems, STOC '89) to design doubly logarithmic time optimal parallel algorithms, because a remarkably small number of such parallel algorithms exist.
Time-Optimal Algorithms on Meshes With Multiple Broadcasting
The mesh-connected computer architecture has emerged as a natural choice for solving a large number of computational tasks in image processing, computational geometry, and computer vision. However, due to its large communication diameter, the mesh tends to be slow when it comes to handling data transfer operations over long distances. In an attempt to overcome this problem, mesh-connected computers have recently been augmented by the addition of various types of bus systems. One such system known as the mesh with multiple broadcasting involves enhancing the mesh architecture by the addition of row and column buses. The mesh with multiple broadcasting has proven to be feasible to implement in VLSI, and is used in the DAP family of computers. In recent years, efficient algorithms to solve a number of computational problems on meshes with multiple broadcasting have been proposed in the literature.
The problems considered in this thesis are semigroup computations, sorting, multiple search, various convexity-related problems, and some tree problems. Based on the size of the input data for the problem under consideration, existing results can be broadly classified into sparse and dense. Specifically, for a given √n x √n mesh with multiple broadcasting, we refer to problems involving m ∈ O(√n) items as sparse, while the case m ∈ O(n) will be referred to as dense. Finally, the case corresponding to 2 ≤ m ≤ n will be termed general. The motivation behind the current work is twofold. First, time-optimal solutions are proposed for the problems listed above. Second, an attempt is made to remove the artificial restriction of the problems studied to the sparse and dense cases.
To establish the time-optimality of the algorithms presented in this work, we use some existing lower bound techniques along with new ones that we develop. We solve the semigroup computation problem for the general case and present a novel lower bound argument. We solve the multiple search problem in the general case and present some surprising applications to computational geometry. In the case of sorting, the general case is defined slightly differently. For the specified range of the size of input, we present a time- and VLSI-optimal algorithm. We also present time lower bound results and matching algorithms for a number of convexity-related and tree problems in the sparse case.
The Sketch of a Polymorphic Symphony
In previous work, we have introduced functional strategies, that is,
first-class generic functions that can traverse into terms of any type while
mixing uniform and type-specific behaviour. In the present paper, we give a
detailed description of one particular Haskell-based model of functional
strategies. This model is characterised as follows. Firstly, we employ
first-class polymorphism as a form of second-order polymorphism for the mere
types of functional strategies. Secondly, we use an encoding scheme of run-time
type case for mixing uniform and type-specific behaviour. Thirdly, we base all
traversal on a fundamental combinator for folding over constructor
applications.
Using this model, we capture common strategic traversal schemes in a highly
parameterised style. We study two original forms of parameterisation. Firstly,
we design parameters for the specific control-flow, data-flow and traversal
characteristics of more concrete traversal schemes. Secondly, we use
overloading to postpone commitment to a specific type scheme of traversal. The
resulting portfolio of traversal schemes can be regarded as a challenging
benchmark for setups for typed generic programming.
The way we develop the model and the suite of traversal schemes, it becomes
clear that parameterised + typed strategic programming is best viewed as a
potent combination of certain bits of parametric, intensional, polytypic, and
ad-hoc polymorphism.
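The paper's model is Haskell-based; purely as a loose analogue for readers unfamiliar with strategic programming, the Python sketch below mimics one traversal scheme (a bottom-up "everywhere" combinator) by combining a uniform generic traversal with a type-specific rewrite selected via isinstance checks. Nothing in it is taken from the paper.

```python
# A loose analogue of a bottom-up "everywhere" traversal scheme: a uniform
# traversal over arbitrary nested terms, mixed with type-specific behaviour.
# isinstance checks stand in for the run-time type case used in the paper.
def everywhere(transform, term):
    """Apply `transform` to every node of a nested list/tuple/dict, bottom-up."""
    if isinstance(term, list):
        term = [everywhere(transform, t) for t in term]
    elif isinstance(term, tuple):
        term = tuple(everywhere(transform, t) for t in term)
    elif isinstance(term, dict):
        term = {k: everywhere(transform, v) for k, v in term.items()}
    return transform(term)                 # uniform part: every node is visited


def negate_ints(node):                     # type-specific part: only ints change
    return -node if isinstance(node, int) else node


print(everywhere(negate_ints, {"xs": [1, (2, "keep")], "y": 3}))
# -> {'xs': [-1, (-2, 'keep')], 'y': -3}
```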
Evaluation of an efficient stack-RLE clustering concept for dynamically adaptive grids
One approach to tackle the challenge of efficient implementations for parallel PDE simulations
on dynamically changing grids is the usage of space-filling curves (SFC). While SFC algorithms
possess advantageous properties such as low memory requirements and close-to-optimal partitioning
approaches with linear complexity, they require efficient communication strategies for keeping and
utilizing the connectivity information, in particular for dynamically changing grids. Our approach
is to use a sparse communication graph to store the connectivity information and to transfer data
block-wise. This permits efficient generation of multiple partitions per memory context (denoted
by clustering) which - in combination with a run-length encoding (RLE) - directly leads to elegant
solutions for shared, distributed and hybrid parallelization and allows cluster-based optimizations.
While previous work focused on specific aspects, we present in this paper an overall compact
summary of the stack-RLE clustering approach, complemented by aspects of the vertex-based
communication that ease understanding of the approach. The central contribution of this work is the proof
of suitability of the stack-RLE clustering approach for an efficient realization of different, relevant
building blocks of Scientific Computing methodology and real-life CSE applications: We show 95%
strong scalability for small-scale scalability benchmarks on 512 cores and weak scalability of over 90%
on 8192 cores for finite-volume solvers and changing grid structure in every time step; optimizations
of simulation data backends by writer tasks; comparisons of analytical benchmarks to analyze the
adaptivity criteria; and a Tsunami simulation of wave propagation as a representative real-world
showcase for our approach, which reduces the overall workload by 95% for parallel fully-adaptive mesh
refinement and, based on a comparison with SFC-ordered regular grid cells, reduces the computation
time by a factor of 7.6 with improved results, and by a factor of 62.2 with results of similar accuracy
relative to buoy station data.
This work was partly supported by the German Research Foundation (DFG) as part of the
Transregional Collaborative Research Centre "Invasive Computing" (SFB/TR 89).
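As an illustration of the space-filling-curve partitioning idea the clustering approach builds on, and not the authors' Sierpinski-curve stack-RLE scheme, the Python sketch below orders grid cells along a Morton (Z-order) curve and splits the ordered sequence into contiguous, balanced chunks; all names are illustrative.

```python
# Illustrative SFC partitioning: order cells by Morton (Z-order) key, then cut
# the ordered sequence into contiguous, balanced chunks. This is only a stand-in
# for the SFC-based partitioning idea, not the paper's stack-RLE clustering.
def morton_key(x, y, bits=16):
    """Interleave the bits of (x, y): the cell's rank along the Z-order curve."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b) | ((y >> b) & 1) << (2 * b + 1)
    return key


def partition(cells, num_parts):
    """Split cells (a list of (x, y) grid indices) into contiguous SFC chunks."""
    ordered = sorted(cells, key=lambda c: morton_key(*c))
    size, rem = divmod(len(ordered), num_parts)
    parts, start = [], 0
    for p in range(num_parts):
        end = start + size + (1 if p < rem else 0)
        parts.append(ordered[start:end])
        start = end
    return parts


cells = [(x, y) for x in range(4) for y in range(4)]
for chunk in partition(cells, 4):
    print(chunk)   # four contiguous runs along the Z-order traversal of a 4x4 grid
```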