21 research outputs found
An optimal EREW parallel algorithm for parenthesis matching
published_or_final_versio
Implementation of parallel algorithm for run of k-local tree automata
Tato práce se zabývá k-lokálními deterministickými konečnými stromovými automaty (DKSA), které hrají důležitou roli při hledání vzorů ve stromových strukturách. Existuje pracovně optimální paralelní algoritmus pro běh k-lokálních DKSA na výpočetním modelu EREW PRAM. Tento algoritmus bude implementován, experimentálně změřen a porovnán se sekvenčním algoritmem v této práci.This thesis deals with k-local deterministic finite tree automata (DFTA) which are important for tree pattern matching. There exists a work-optimal parallel algorithm for a run of k-local DFTA on EREW PRAM. This algorithm will be implemented, experimentally measured and compared with the sequential algorithm in this thesis
Recommended from our members
Efficient Linked List Ranking Algorithms and Parentheses Matching as a New Strategy for Parallel Algorithm Design
The goal of a parallel algorithm is to solve a single problem using multiple processors working together and to do so in an efficient manner. In this regard, there is a need to categorize strategies in order to solve broad classes of problems with similar structures and requirements. In this dissertation, two parallel algorithm design strategies are considered: linked list ranking and parentheses matching
Recommended from our members
Efficient Parallel Algorithms and Data Structures Related to Trees
The main contribution of this dissertation proposes a new paradigm, called the parentheses matching paradigm. It claims that this paradigm is well suited for designing efficient parallel algorithms for a broad class of nonnumeric problems. To demonstrate its applicability, we present three cost-optimal parallel algorithms for breadth-first traversal of general trees, sorting a special class of integers, and coloring an interval graph with the minimum number of colors
Optimal parallel algorithms for rectilinear link-distance problems
We provide optimal parallel solutions to several link-distance problems set in trapezoided rectilinear polygons. All our main parallel algorithms are deterministic and designed to run on the exclusive read exclusive write parallel random access machine (EREW PRAM). Let P be a trapezoided rectilinear simple polygon with n vertices. In O(log n) time using O(n/log n) processors we can optimally compute: 1. Minimum réctilinear link paths, or shortest paths in the L1 metric from any point in P to all vertices of P. 2. Minimum rectilinear link paths from any segment inside P to all vertices of P. 3. The rectilinear window (histogram) partition of P. 4. Both covering radii and vertex intervals for any diagonal of P. 5. A data structure to support rectilinear link-distance queries between any two points in P (queries can be answered optimally in O(log n) time by uniprocessor). Our solution to 5 is based on a new linear-time sequential algorithm for this problem which is also provided here. This improves on the previously best-known sequential algorithm for this problem which used O(n log n) time and space.5 We develop techniques for solving link-distance problems in parallel which are expected to find applications in the design of other parallel computational geometry algorithms. We employ these parallel techniques, for example, to compute (on a CREW PRAM) optimally the link diameter, the link center, and the central diagonal of a rectilinear polygon
Parallel Parsing in a Multiprocessor Environment
Parsing in a multiprocessor environment is considered. Two models for asynchronous bottom-up parallel parsing are presented. A method for estimating speedup in asynchronous bottom-up parallel parsing is developed, and it is used to estimate speedup obtainable by bottom-up parallel parsing of Pascal-like languages. It is found that bottom-up parallel parsing algorithms can attain a maximum speedup of 0 (L1/2) with (L1/2) processors, where L is the number of tokens in the string being parsed. Hence, bottom-up parallel parsing technique does not yield good speedup. A new parsing technique is proposed for parsing a class of block-structured languages. The novelty of the technique is that it is inherently parallel. By applying this new technique, a string of L tokens can be parsed in O (log L) time with (L /log L) processors. The parsing algorithm uses a parenthesis-matching algorithm developed here. The parenthesis-matching algorithm can find matching of a sequence of parentheses in O (log L) time with (L /log L) processors. Thus, the new parsing algorithm is cost optimal
Adaptive Massively Parallel Constant-Round Tree Contraction
Miller and Reif's FOCS'85 classic and fundamental tree contraction algorithm
is a broadly applicable technique for the parallel solution of a large number
of tree problems. Additionally it is also used as an algorithmic design
technique for a large number of parallel graph algorithms. In all previously
explored models of computation, however, tree contractions have only been
achieved in rounds of parallel run time. In this work, we not
only introduce a generalized tree contraction method but also show it can be
computed highly efficiently in rounds in the Adaptive
Massively Parallel Computing (AMPC) setting, where each machine has
local memory for some . AMPC is a practical
extension of Massively Parallel Computing (MPC) which utilizes distributed hash
tables. In general, MPC is an abstract model for MapReduce, Hadoop, Spark, and
Flume which are currently widely used across industry and has been studied
extensively in the theory community in recent years. Last but not least, we
show that our results extend to multiple problems on trees, including but not
limited to maximum and maximal matching, maximum and maximal independent set,
tree isomorphism testing, and more.Comment: 35 pages, 3 figures, to be published in Innovations in Theoretical
Computer Science (ITCS
Parallel Algorithm for Finding All Minimal Maximum Subsequences via Random-walk Theory
A maximum contiguous subsequence of a real-valued sequence is a contiguous subsequence with the maximum cumulative sum. A minimal maximum contiguous subsequence is a minimal contiguous subsequence among all maximum ones of the sequence. Time- and space-efficient algorithms for finding the single or multiple contiguous subsequences of a real-valued sequence with large cumulative sums, in addition to its combinatorial appeal, have major applications such as in bioinformatics, pattern matching, and data mining. We have designed and implemented a domain-decomposed parallel algorithm on cluster systems with Message Passing Interface that finds all minimal maximum subsequences of a random sample sequence from a normal distribution with negative mean. We find the structural decomposition of the sequence with overlapping common subsequences for adjacent processors that allow hosting processors to compute the minimal maximum subsequences of the sequence independently. Our study employs the theory of random walk to derive an approximate probabilistic length bound for common subsequences in an appropriate probabilistic setting, which is incorporated in the algorithm to facilitate the concurrent computation of all minimal maximum subsequences in hosting processors. We also present an empirical study of the speedup and efficiency achieved by the parallel algorithm with synthetic random data.Computer Scienc
Scheduling and reconfiguration of interconnection network switches
Interconnection networks are important parts of modern computing systems, facilitating communication between a system\u27s components. Switches connecting various nodes of an interconnection network serve to move data in the network. The switch\u27s delay and throughput impact the overall performance of the network and thus the system. Scheduling efficient movement of data through a switch and configuring the switch to realize a schedule are the main themes of this research. We consider various interconnection network switches including (i) crossbar-based switches, (ii) circuit-switched tree switches, and (iii) fat-tree switches. For crossbar-based input-queued switches, a recent result established that logarithmic packet delay is possible. However, this result assumes that packet transmission time through the switch is no less than schedule-generation time. We prove that without this assumption (as is the case in practice) packet delay becomes linear. We also report results of simulations that bear out our result for practical switch sizes and indicate that a fast scheduling algorithm reduces not only packet delay but also buffer size. We also propose a fast mesh-of-trees based distributed switch scheduling (maximal-matching based) algorithm that has polylog complexity. A circuit-switched tree (CST) can serve as an interconnect structure for various computing architectures and models such as the self-reconfigurable gate array and the reconfigurable mesh. A CST is a tree structure with source and destination processing elements as leaves and switches as internal nodes. We design several scheduling and configuration algorithms that distributedly partition a given set of communications into non-conflicting subsets and then establish switch settings and paths on the CST corresponding to the communications. A fat-tree is another widely used interconnection structure in many of today\u27s high-performance clusters. We embed a reconfigurable mesh inside a fat-tree switch to generate efficient connections. We present an R-Mesh-based algorithm for a fat-tree switch that creates buses connecting input and output ports corresponding to various communications using that switch