21 research outputs found

    An optimal EREW parallel algorithm for parenthesis matching

    Get PDF
    published_or_final_versio

    Implementation of parallel algorithm for run of k-local tree automata

    Get PDF
    Tato práce se zabývá k-lokálními deterministickými konečnými stromovými automaty (DKSA), které hrají důležitou roli při hledání vzorů ve stromových strukturách. Existuje pracovně optimální paralelní algoritmus pro běh k-lokálních DKSA na výpočetním modelu EREW PRAM. Tento algoritmus bude implementován, experimentálně změřen a porovnán se sekvenčním algoritmem v této práci.This thesis deals with k-local deterministic finite tree automata (DFTA) which are important for tree pattern matching. There exists a work-optimal parallel algorithm for a run of k-local DFTA on EREW PRAM. This algorithm will be implemented, experimentally measured and compared with the sequential algorithm in this thesis

    Optimal parallel algorithms for rectilinear link-distance problems

    Get PDF
    We provide optimal parallel solutions to several link-distance problems set in trapezoided rectilinear polygons. All our main parallel algorithms are deterministic and designed to run on the exclusive read exclusive write parallel random access machine (EREW PRAM). Let P be a trapezoided rectilinear simple polygon with n vertices. In O(log n) time using O(n/log n) processors we can optimally compute: 1. Minimum réctilinear link paths, or shortest paths in the L1 metric from any point in P to all vertices of P. 2. Minimum rectilinear link paths from any segment inside P to all vertices of P. 3. The rectilinear window (histogram) partition of P. 4. Both covering radii and vertex intervals for any diagonal of P. 5. A data structure to support rectilinear link-distance queries between any two points in P (queries can be answered optimally in O(log n) time by uniprocessor). Our solution to 5 is based on a new linear-time sequential algorithm for this problem which is also provided here. This improves on the previously best-known sequential algorithm for this problem which used O(n log n) time and space.5 We develop techniques for solving link-distance problems in parallel which are expected to find applications in the design of other parallel computational geometry algorithms. We employ these parallel techniques, for example, to compute (on a CREW PRAM) optimally the link diameter, the link center, and the central diagonal of a rectilinear polygon

    Parallel Parsing in a Multiprocessor Environment

    Get PDF
    Parsing in a multiprocessor environment is considered. Two models for asynchronous bottom-up parallel parsing are presented. A method for estimating speedup in asynchronous bottom-up parallel parsing is developed, and it is used to estimate speedup obtainable by bottom-up parallel parsing of Pascal-like languages. It is found that bottom-up parallel parsing algorithms can attain a maximum speedup of 0 (L1/2) with (L1/2) processors, where L is the number of tokens in the string being parsed. Hence, bottom-up parallel parsing technique does not yield good speedup. A new parsing technique is proposed for parsing a class of block-structured languages. The novelty of the technique is that it is inherently parallel. By applying this new technique, a string of L tokens can be parsed in O (log L) time with (L /log L) processors. The parsing algorithm uses a parenthesis-matching algorithm developed here. The parenthesis-matching algorithm can find matching of a sequence of parentheses in O (log L) time with (L /log L) processors. Thus, the new parsing algorithm is cost optimal

    Adaptive Massively Parallel Constant-Round Tree Contraction

    Get PDF
    Miller and Reif's FOCS'85 classic and fundamental tree contraction algorithm is a broadly applicable technique for the parallel solution of a large number of tree problems. Additionally it is also used as an algorithmic design technique for a large number of parallel graph algorithms. In all previously explored models of computation, however, tree contractions have only been achieved in Ω(logn)\Omega(\log n) rounds of parallel run time. In this work, we not only introduce a generalized tree contraction method but also show it can be computed highly efficiently in O(1/ϵ3)O(1/\epsilon^3) rounds in the Adaptive Massively Parallel Computing (AMPC) setting, where each machine has O(nϵ)O(n^\epsilon) local memory for some 0<ϵ<10 < \epsilon < 1. AMPC is a practical extension of Massively Parallel Computing (MPC) which utilizes distributed hash tables. In general, MPC is an abstract model for MapReduce, Hadoop, Spark, and Flume which are currently widely used across industry and has been studied extensively in the theory community in recent years. Last but not least, we show that our results extend to multiple problems on trees, including but not limited to maximum and maximal matching, maximum and maximal independent set, tree isomorphism testing, and more.Comment: 35 pages, 3 figures, to be published in Innovations in Theoretical Computer Science (ITCS

    Parallel Algorithm for Finding All Minimal Maximum Subsequences via Random-walk Theory

    Get PDF
    A maximum contiguous subsequence of a real-valued sequence is a contiguous subsequence with the maximum cumulative sum. A minimal maximum contiguous subsequence is a minimal contiguous subsequence among all maximum ones of the sequence. Time- and space-efficient algorithms for finding the single or multiple contiguous subsequences of a real-valued sequence with large cumulative sums, in addition to its combinatorial appeal, have major applications such as in bioinformatics, pattern matching, and data mining. We have designed and implemented a domain-decomposed parallel algorithm on cluster systems with Message Passing Interface that finds all minimal maximum subsequences of a random sample sequence from a normal distribution with negative mean. We find the structural decomposition of the sequence with overlapping common subsequences for adjacent processors that allow hosting processors to compute the minimal maximum subsequences of the sequence independently. Our study employs the theory of random walk to derive an approximate probabilistic length bound for common subsequences in an appropriate probabilistic setting, which is incorporated in the algorithm to facilitate the concurrent computation of all minimal maximum subsequences in hosting processors. We also present an empirical study of the speedup and efficiency achieved by the parallel algorithm with synthetic random data.Computer Scienc

    Fast parallel algorithms for approximate string matching

    Get PDF

    Scheduling and reconfiguration of interconnection network switches

    Get PDF
    Interconnection networks are important parts of modern computing systems, facilitating communication between a system\u27s components. Switches connecting various nodes of an interconnection network serve to move data in the network. The switch\u27s delay and throughput impact the overall performance of the network and thus the system. Scheduling efficient movement of data through a switch and configuring the switch to realize a schedule are the main themes of this research. We consider various interconnection network switches including (i) crossbar-based switches, (ii) circuit-switched tree switches, and (iii) fat-tree switches. For crossbar-based input-queued switches, a recent result established that logarithmic packet delay is possible. However, this result assumes that packet transmission time through the switch is no less than schedule-generation time. We prove that without this assumption (as is the case in practice) packet delay becomes linear. We also report results of simulations that bear out our result for practical switch sizes and indicate that a fast scheduling algorithm reduces not only packet delay but also buffer size. We also propose a fast mesh-of-trees based distributed switch scheduling (maximal-matching based) algorithm that has polylog complexity. A circuit-switched tree (CST) can serve as an interconnect structure for various computing architectures and models such as the self-reconfigurable gate array and the reconfigurable mesh. A CST is a tree structure with source and destination processing elements as leaves and switches as internal nodes. We design several scheduling and configuration algorithms that distributedly partition a given set of communications into non-conflicting subsets and then establish switch settings and paths on the CST corresponding to the communications. A fat-tree is another widely used interconnection structure in many of today\u27s high-performance clusters. We embed a reconfigurable mesh inside a fat-tree switch to generate efficient connections. We present an R-Mesh-based algorithm for a fat-tree switch that creates buses connecting input and output ports corresponding to various communications using that switch
    corecore