
    Automatic parallelization by pattern-matching

    KPN-based parallelization of Wu–Manber algorithm on multi-core machines

    © 2019, Springer Science+Business Media, LLC, part of Springer Nature. Pattern matching is the most time-consuming task in many cybersecurity, bioinformatics and computational biology applications. Speeding up pattern matching is therefore essential to the success of these applications. The Wu–Manber algorithm is one of the fastest and most widely used algorithms for multi-pattern matching. Many researchers have focused on improving the performance of the Wu–Manber algorithm, and this work presents a novel attempt to parallelize Wu–Manber and make it suitable for multi-core machines. This paper uses the Kahn processing network (KPN) model to parallelize both data and functional tasks. KPN offers a parallel programming model that can be used on today’s multi-core machines. Hence, we use the KPN model to tailor the execution of the Wu–Manber algorithm by breaking down the complexity of data sharing and task processing. Data parallelization is implemented through concurrent execution of multiple KPNs. In addition, task parallelization is achieved within each executing KPN. A single KPN consists of two threads, a producer thread and a consumer thread. The proposed KPN-based parallelization achieves up to 4× speedup over the serial implementation of the algorithm. Finally, the algorithm's performance scales well with increasing workloads, and the speedup remains almost constant as the number of attack signatures increases.
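    The producer/consumer structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matching step is a naive prefix check standing in for Wu–Manber, and the names `producer`, `consumer`, `run_kpn` and `parallel_match` are hypothetical. Each KPN is assumed to be one producer thread and one consumer thread joined by an unbounded FIFO with blocking reads, and data parallelism is assumed to mean one KPN per slice of start offsets. The patterns are assumed to be non-empty.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker on the FIFO channel


def producer(text, start, end, window, out_q):
    """Emit one (offset, window) token per start position in [start, end)."""
    for i in range(start, end):
        # The window is cut from the full text, so matches that cross
        # a slice boundary are still visible to the consumer.
        out_q.put((i, text[i:i + window]))
    out_q.put(SENTINEL)


def consumer(in_q, patterns, results):
    """Blocking reads from the FIFO; record each pattern starting at an offset."""
    while True:
        token = in_q.get()
        if token is SENTINEL:
            break
        offset, chunk = token
        for p in patterns:
            # Naive check, used here in place of Wu-Manber's shift tables.
            if chunk.startswith(p):
                results.append((offset, p))


def run_kpn(text, start, end, patterns, results):
    """One KPN: a producer and a consumer thread joined by a FIFO channel."""
    window = max(len(p) for p in patterns)
    channel = queue.Queue()  # unbounded FIFO, as in the Kahn model
    t_prod = threading.Thread(target=producer,
                              args=(text, start, end, window, channel))
    t_cons = threading.Thread(target=consumer,
                              args=(channel, patterns, results))
    t_prod.start()
    t_cons.start()
    t_prod.join()
    t_cons.join()


def parallel_match(text, patterns, num_kpns=2):
    """Data parallelism: each KPN handles its own slice of start offsets."""
    step = -(-len(text) // num_kpns)  # ceiling division
    partial = [[] for _ in range(num_kpns)]  # one private result list per KPN
    threads = []
    for k in range(num_kpns):
        start, end = k * step, min((k + 1) * step, len(text))
        t = threading.Thread(target=run_kpn,
                             args=(text, start, end, patterns, partial[k]))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return sorted(m for part in partial for m in part)
```

    For example, `parallel_match("abracadabra", ["abra", "cad"], num_kpns=3)` returns the matches sorted by offset. In CPython the global interpreter lock keeps these threads from running the matching loop in parallel, so the sketch shows only the structure of the network, not the reported speedup.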

    Shared Memory Parallel Subgraph Enumeration

    The subgraph enumeration problem asks us to find all subgraphs of a target graph that are isomorphic to a given pattern graph. Determining whether even one such isomorphic subgraph exists is NP-complete---and therefore finding all such subgraphs (if they exist) is a time-consuming task. Subgraph enumeration has applications in many fields, including biochemistry and social networks, and interestingly the fastest algorithms for solving the problem for biochemical inputs are sequential. Since they depend on depth-first tree traversal, an efficient parallelization is far from trivial. Nevertheless, since important applications produce data sets with increasing difficulty, parallelism seems beneficial. We thus present here a shared-memory parallelization of the state-of-the-art subgraph enumeration algorithms RI and RI-DS (a variant of RI for dense graphs) by Bonnici et al. [BMC Bioinformatics, 2013]. Our strategy uses work stealing and our implementation demonstrates a significant speedup on real-world biochemical data---despite a highly irregular data access pattern. We also improve RI-DS by pruning the search space better; this further improves the empirical running times compared to the already highly tuned RI-DS.
    Comment: 18 pages, 12 figures. To appear at the 7th IEEE Workshop on Parallel / Distributed Computing and Optimization (PDCO 2017).
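    The depth-first search that this abstract refers to can be sketched as follows. This is a plain backtracking enumerator, not RI or RI-DS, and it replaces work stealing with a simpler static split: each choice for the first pattern vertex becomes one independent task. Graphs are assumed to be undirected and given as dictionaries mapping a vertex to the set of its neighbours; `extend` and `enumerate_subgraphs` are hypothetical names. The sketch returns mappings, so a subgraph is reported once per automorphism of the pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def extend(pattern, target, order, mapping, found):
    """Depth-first extension of a partial injective mapping pattern -> target."""
    if len(mapping) == len(order):
        found.append(dict(mapping))
        return
    u = order[len(mapping)]  # next pattern vertex to place
    used = set(mapping.values())
    for v in target:
        if v in used:
            continue
        # Every already-mapped pattern neighbour of u must land on a
        # target neighbour of v, otherwise a pattern edge would be lost.
        if all(mapping[w] in target[v] for w in pattern[u] if w in mapping):
            mapping[u] = v
            extend(pattern, target, order, mapping, found)
            del mapping[u]  # backtrack


def enumerate_subgraphs(pattern, target, workers=4):
    """Enumerate all edge-preserving injective mappings of pattern into target."""
    order = sorted(pattern)
    if not order:
        return [{}]

    def task(v):
        # One search subtree per candidate image of the first pattern vertex.
        found = []
        extend(pattern, target, order, {order[0]: v}, found)
        return found

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [m for part in pool.map(task, sorted(target)) for m in part]
```

    The subtrees produced by the static split can differ greatly in size, which is the load imbalance that a work-stealing scheduler is meant to address. A realistic implementation would also choose a smarter vertex order than `sorted(pattern)`.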