1,235 research outputs found

    Parallel Natural Language Parsing: From Analysis to Speedup

    Get PDF
    Electrical Engineering, Mathematics and Computer Scienc

    Parallel parsing made practical

    Get PDF
    The property of local parsability allows to parse inputs through inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm is easily amenable to be automatically generated via a parser generator tool, which was realized, and is also presented in the following. Furthermore, to complete the framework of a parallel input analysis, a parallel scanner can also combined with the parser. To prove the practicality of a parallel lexing and parsing approach, we report the results of the adaptation of JSON and Lua to a form fit for parallel parsing (i.e. an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high performance and embedded multicore platforms, obtained analyzing real-world inputs as a test-bench. The results show that our approach matches or dominates the performances of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multi-core machines. The work is concluded by a broad and critical survey of the past work on parallel parsing and future directions on the integration with semantic analysis and incremental parsing

    Efficient Parallel and Distributed Algorithms for GIS Polygon Overlay Processing

    Get PDF
    Polygon clipping is one of the complex operations in computational geometry. It is used in Geographic Information Systems (GIS), Computer Graphics, and VLSI CAD. For two polygons with n and m vertices, the number of intersections can be O(nm). In this dissertation, we present the first output-sensitive CREW PRAM algorithm, which can perform polygon clipping in O(log n) time using O(n + k + k\u27) processors, where n is the number of vertices, k is the number of intersections, and k\u27 is the additional temporary vertices introduced due to the partitioning of polygons. The current best algorithm by Karinthi, Srinivas, and Almasi does not handle self-intersecting polygons, is not output-sensitive and must employ O(n^2) processors to achieve O(log n) time. The second parallel algorithm is an output-sensitive PRAM algorithm based on Greiner-Hormann algorithm with O(log n) time complexity using O(n + k) processors. This is cost-optimal when compared to the time complexity of the best-known sequential plane-sweep based algorithm for polygon clipping. For self-intersecting polygons, the time complexity is O(((n + k) log n log log n)/p) using p In addition to these parallel algorithms, the other main contributions in this dissertation are 1) multi-core and many-core implementation for clipping a pair of polygons and 2) MPI-GIS and Hadoop Topology Suite for distributed polygon overlay using a cluster of nodes. Nvidia GPU and CUDA are used for the many-core implementation. The MPI based system achieves 44X speedup while processing about 600K polygons in two real-world GIS shapefiles 1) USA Detailed Water Bodies and 2) USA Block Group Boundaries) within 20 seconds on a 32-node (8 cores each) IBM iDataPlex cluster interconnected by InfiniBand technology

    Parallel Parsing in a Multiprocessor Environment

    Get PDF
    Parsing in a multiprocessor environment is considered. Two models for asynchronous bottom-up parallel parsing are presented. A method for estimating speedup in asynchronous bottom-up parallel parsing is developed, and it is used to estimate speedup obtainable by bottom-up parallel parsing of Pascal-like languages. It is found that bottom-up parallel parsing algorithms can attain a maximum speedup of 0 (L1/2) with (L1/2) processors, where L is the number of tokens in the string being parsed. Hence, bottom-up parallel parsing technique does not yield good speedup. A new parsing technique is proposed for parsing a class of block-structured languages. The novelty of the technique is that it is inherently parallel. By applying this new technique, a string of L tokens can be parsed in O (log L) time with (L /log L) processors. The parsing algorithm uses a parenthesis-matching algorithm developed here. The parenthesis-matching algorithm can find matching of a sequence of parentheses in O (log L) time with (L /log L) processors. Thus, the new parsing algorithm is cost optimal
    • ā€¦
    corecore