4 research outputs found

Research on Image Remap Algorithm Optimization Based on OpenCL

Author: 吴再龙
张云泉
徐建良
贾海鹏
龙国平
Publication date: 01/01/2013

The remap algorithm is a typical image transformation algorithm, widely used in image scaling, warping, and rotation. As image size and resolution keep increasing, ever higher performance is demanded of image mapping algorithms. Taking full account of the architectural differences between GPU platforms, this paper systematically studies how to implement the remap algorithm efficiently on different GPU platforms under the OpenCL framework. It examines how several optimizations, including global (off-chip) memory access optimization, vectorized computation, and reduction of dynamic instructions such as branch tests, affect performance on each platform, and it suggests the possibility of performance portability across GPU platforms. Experimental results show that, excluding data transfer time, the optimized algorithm achieves a speedup of 114.3~491.5 times over the CPU version and 1.01~1.86 times over the CUDA version (the existing GPU implementation) on an AMD HD5850 GPU, and of 100.7~369.8 times over the CPU version and 0.95~1.58 times over the CUDA version on an NVIDIA C2050 GPU. These results validate the effectiveness and performance portability of the proposed optimizations.
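
For readers unfamiliar with remap, the sketch below shows the basic operation in plain Python: each output pixel is read from the source image at a coordinate given by two lookup maps, with bilinear interpolation between the four neighbouring pixels. It is only an illustration under stated assumptions, not the paper's OpenCL kernel; the function name `remap_bilinear`, the nested-list image layout, and the constant-border policy are all hypothetical choices made for this example.

```python
import math


def remap_bilinear(src, map_x, map_y, border=0.0):
    """Return dst where dst[y][x] samples src at (map_x[y][x], map_y[y][x]).

    Assumed layout (not from the paper): images are lists of rows of numbers.
    """
    h, w = len(src), len(src[0])

    def pixel(x, y):
        # Source coordinates outside the image read as a constant border value.
        if 0 <= x < w and 0 <= y < h:
            return src[y][x]
        return border

    dst = []
    for row_x, row_y in zip(map_x, map_y):
        out_row = []
        for sx, sy in zip(row_x, row_y):
            # Integer corner and fractional offset of the sampling position.
            x0, y0 = math.floor(sx), math.floor(sy)
            fx, fy = sx - x0, sy - y0
            # Interpolate along x on the two rows, then along y between them.
            top = (1 - fx) * pixel(x0, y0) + fx * pixel(x0 + 1, y0)
            bottom = (1 - fx) * pixel(x0, y0 + 1) + fx * pixel(x0 + 1, y0 + 1)
            out_row.append((1 - fy) * top + fy * bottom)
        dst.append(out_row)
    return dst
```

Every output pixel here is computed independently of the others, which is why the operation maps naturally onto one GPU work-item per pixel; the scattered reads from `src` are the kind of global memory traffic that the memory access optimizations mentioned in the abstract would target.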

Institute Of Software, Chinese Academy Of Sciences

Implementation and Optimization of an Image Moments Algorithm Based on OpenCL

Author: 吴再龙
张云泉
王伟俨
王靖
解庆春
颜深根
龙国平
Publication date: 01/01/2013

Institute Of Software, Chinese Academy Of Sciences

Research on Kmeans Algorithm Optimization Based on OpenCL

Author: 吴再龙
张云泉
徐建良
王伟俨
贾海鹏
颜深根
Publication date: 01/01/2013

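Kmeans is a typical clustering algorithm and an important method for partitioning and grouping known data. It is widely used in image processing, machine learning, and biology. As data sizes keep growing, ever higher performance is demanded of the Kmeans algorithm. Taking full account of the differences in hardware architecture across platforms, this paper systematically studies efficient implementations of the Kmeans algorithm on GPU and APU platforms under the OpenCL framework. Using several methods, including a GPU implementation of an iterative algorithm with multiple global synchronizations, redundant computation to reduce the number of global synchronizations, redistribution of thread tasks, and reuse of local memory, it achieves efficient implementations of Kmeans on different hardware architectures and summarizes a set of optimization methods applicable to iterative algorithms. Experimental results show that, with data transfer time included, the optimized algorithm achieves a speedup of 136 975M70 333 times over the CPU version on an AMD HD7970 GPU and of 22.2365~24.3865 times over the CPU version on an AMD A10-5800K APU. These results validate the effectiveness of the proposed optimization methods and their portability across platforms.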
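
As background for the abstract above, the following is a minimal sequential sketch of the standard Kmeans (Lloyd) iteration in plain Python. It is not the paper's OpenCL implementation; the function name `kmeans`, the tuple-based point format, and the stopping rule are assumptions made for illustration. The comments mark where each iteration needs results from all points before it can continue, which is the kind of global synchronization the abstract refers to.

```python
def kmeans(points, centroids, max_iters=100):
    """Run Lloyd's iteration; return (final centroids, label of each point)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point picks its nearest centroid.
        # Points are independent here, so this step parallelizes per point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        # This needs every label from the step above, hence a global sync point.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels
```

For example, `kmeans([(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)], [(0.0, 0.0), (10.0, 0.0)])` groups the two left-hand points and the two right-hand points into separate clusters.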

Institute Of Software, Chinese Academy Of Sciences

Observation of the Superheavy Nuclide (271)Ds

Author: Gan ZG(甘再国)
Huang MH(黄明辉)
Huang TH(黄天衡)
Jia GB(贾国斌)
Li GS(李广顺)
Ma L(马龙)
Ren ZZ(任中洲)
Wu XL(吴晓蕾)
Xiao GQ(肖国青)
Xu HS(徐瑚珊)
Yu L(郁琳)
Zhan WL(詹文龙)
Zhang HQ(张焕乔)
Zhang YH(张玉虎)
Zhang ZY(张志远)
Zhou SG(周善贵)
Zhou XH(周小红)

With the recent commissioning of a gas-filled recoil separator at the Institute of Modern Physics (IMP) in Lanzhou, the decay properties of (271)Ds (Z = 110) were studied via the Pb-208(Ni-64, n) reaction at a beam energy of 313.3 MeV. Using the separator coupled with a position-sensitive silicon strip detector, we carried out energy-position-time correlation measurements for the implanted nucleus and its subsequent alpha decays. One alpha-decay chain for (271)Ds was established. The alpha energy and decay time of the (271)Ds nucleus were measured to be 10.644 MeV and 96.8 ms, which are consistent with the values reported in the literature.

Institutional Repository of Institute of Modern Physics, CAS