2 research outputs found

    Enhancing the performance of the aggregated bit vector algorithm in network packet classification using GPU

    Get PDF
    Packet classification is a computationally intensive, highly parallelizable task in many advanced network systems like high-speed routers and firewalls that enable different functionalities through discriminating incoming traffic. Recently, graphics processing units (GPUs) have been exploited as efficient accelerators for parallel implementation of software classifiers. The aggregated bit vector is a highly parallelizable packet classification algorithm. In this work, first we present a parallel kernel for running this algorithm on GPUs. Next, we adapt an asymptotic analysis method which predicts any empirical result of the proposed kernel. Experimental results not only confirm the efficiency of the proposed parallel kernel but also reveal the accuracy of the analysis method in predicting important trends in experimental results

    A GPU-based solution for fast calculation of the betweenness centrality in large weighted networks

    No full text
    Betweenness, a widely employed centrality measure in network science, is a decent proxy for investigating network loads and rankings. However, its extremely high computational cost greatly hinders its applicability in large networks. Although several parallel algorithms have been presented to reduce its calculation cost for unweighted networks, a fast solution for weighted networks, which are commonly encountered in many realistic applications, is still lacking. In this study, we develop an efficient parallel GPU-based approach to boost the calculation of the betweenness centrality (BC) for large weighted networks. We parallelize the traditional Dijkstra algorithm by selecting more than one frontier vertex each time and then inspecting the frontier vertices simultaneously. By combining the parallel SSSP algorithm with the parallel BC framework, our GPU-based betweenness algorithm achieves much better performance than its CPU counterparts. Moreover, to further improve performance, we integrate the work-efficient strategy, and to address the load-imbalance problem, we introduce a warp-centric technique, which assigns many threads rather than one to a single frontier vertex. Experiments on both realistic and synthetic networks demonstrate the efficiency of our solution, which achieves 2.9× to 8.44× speedups over the parallel CPU implementation. Our algorithm is open-source and free to the community; it is publicly available through https://dx.doi.org/10.6084/m9.figshare.4542405. Considering the pervasive deployment and declining price of GPUs in personal computers and servers, our solution will offer unprecedented opportunities for exploring betweenness-related problems and will motivate follow-up efforts in network science
    corecore