7,215 research outputs found

    Acceleration computing process in wavelength scanning interferometry

    Get PDF
    The optical interferometry has been widely explored for surface measurement due to the advantages of non-contact and high accuracy interrogation. Eventually, some interferometers are used to measure both rough and smooth surfaces such as white light interferometry and wavelength scanning interferometry (WSI). The WSI can be used to measure large discontinuous surface profiles without the phase ambiguity problems. However, the WSI usually needs to capture hundreds of interferograms at different wavelength in order to evaluate the surface finish for a sample. The evaluating process for this large amount of data needs long processing time if CPUs traditional programming is used. This paper presents a parallel programming model to achieve the data parallelism for accelerating the computing analysis of the captured data. This parallel programming is based on CUDATM C program structure that developed by NVIDIA. Additionally, this paper explains the mathematical algorithm that has been used for evaluating the surface profiles. The computing time and accuracy obtained from CUDA program, using GeForce GTX 280 graphics processing unit (GPU), were compared to those obtained from sequential execution Matlab program, using Intel® Core™2 Duo CPU. The results of measuring a step height sample shows that the parallel programming capability of the GPU can highly accelerate the floating point calculation throughput compared to multicore CPU

    Counting Triangles in Large Graphs on GPU

    Full text link
    The clustering coefficient and the transitivity ratio are concepts often used in network analysis, which creates a need for fast practical algorithms for counting triangles in large graphs. Previous research in this area focused on sequential algorithms, MapReduce parallelization, and fast approximations. In this paper we propose a parallel triangle counting algorithm for CUDA GPU. We describe the implementation details necessary to achieve high performance and present the experimental evaluation of our approach. Our algorithm achieves 8 to 15 times speedup over the CPU implementation and is capable of finding 3.8 billion triangles in an 89 million edges graph in less than 10 seconds on the Nvidia Tesla C2050 GPU.Comment: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW
    • …
    corecore