
    Accelerating ASIFT Based on CPU/GPU Synergetic Parallel Computing

    ASIFT (Affine-SIFT) is a fully affine-invariant and scale-invariant local feature extraction algorithm that gives good results in image matching, but its high computational complexity has kept it out of real-time processing. Based on an analysis of where ASIFT spends its running time, the SIFT stage was first ported to the GPU, using shared memory and coalesced memory access to improve data-access efficiency; the remaining stages of ASIFT were then moved to the GPU as well, yielding GASIFT. Throughout the GASIFT pipeline, a device-memory pool is used to reduce the number of GPU memory allocations and releases. Finally, two modes of CPU/GPU synergetic parallel computing were compared. Experiments show that the mode in which the CPU handles logical control and the GPU handles parallel computation suits GASIFT best; in this mode GASIFT achieves a strong speedup, especially on medium and large images. For a large 2048×1536 image, GASIFT reaches a 16× speedup over standard ASIFT and a 7× speedup over an OpenMP-optimized ASIFT, greatly improving the feasibility of applying ASIFT to real-time computation.
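
    The abstract does not describe how the device-memory pool works; a minimal sketch of the idea, assuming a size-keyed free list (the DeviceMemoryPool class and its reuse policy are illustrative assumptions, not the paper's implementation), could look like this:

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <map>

// Hypothetical device-memory pool: reuse freed blocks instead of calling
// cudaMalloc/cudaFree for every intermediate buffer (a sketch, not GASIFT's code).
class DeviceMemoryPool {
public:
    void* acquire(size_t bytes) {
        auto it = free_blocks_.lower_bound(bytes);
        if (it != free_blocks_.end()) {        // reuse the smallest block that fits
            void* p = it->second;
            free_blocks_.erase(it);
            return p;
        }
        void* p = nullptr;
        cudaMalloc(&p, bytes);                 // allocate only on a pool miss
        sizes_[p] = bytes;
        return p;
    }
    void release(void* p) {                    // return the block to the pool
        free_blocks_.emplace(sizes_[p], p);
    }
    ~DeviceMemoryPool() {                      // free everything exactly once
        for (auto& kv : sizes_) cudaFree(kv.first);
    }
private:
    std::multimap<size_t, void*> free_blocks_; // allocated size -> reusable pointer
    std::map<void*, size_t> sizes_;            // true size of each allocation
};

int main() {
    DeviceMemoryPool pool;
    void* a = pool.acquire(1 << 20);           // first pass: real cudaMalloc
    pool.release(a);
    void* b = pool.acquire(1 << 20);           // later pass: reused, no cudaMalloc
    pool.release(b);
    return 0;
}
```

    Under this scheme, each scale-space or intermediate buffer would be acquired at the start of a pass and released at its end, so only the first pass pays the allocation cost.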

    Accelerating hierarchical distributed latent Dirichlet allocation algorithm by parallel GPU

    Hierarchical Distributed Latent Dirichlet Allocation (HD-LDA) is a probabilistic topic-modeling algorithm that improves on Latent Dirichlet Allocation (LDA): unlike LDA, which runs on a single machine, HD-LDA runs in a distributed framework for parallel processing. Mahout implements HD-LDA on Hadoop, but because the per-node computation is heavy, classifying large document collections still takes too long: the corpus is split across nodes for iterative inference, yet each node still infers its own document collection sequentially. To address this, Hadoop was combined with the Graphics Processing Unit (GPU): the per-node inference is moved onto the GPU so that a node infers many documents in parallel, and multiple GPUs working in parallel accelerate the HD-LDA algorithm. Application results show that this approach achieves a 7× speedup for HD-LDA on large document collections in the distributed framework.
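
    The abstract gives no kernel details; one plausible structure (the kernel name, data layout, and placeholder update below are assumptions, not Mahout's or the paper's code) maps each document on a node to one CUDA block, turning the sequential per-document loop into a parallel grid launch:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical sketch: one block per document; threads stride over the
// document's tokens. A real HD-LDA step would resample each token's topic
// using the shared word-topic counts; here we only tally tokens.
__global__ void inferDocuments(const int* token_ids, const int* doc_offsets,
                               int* doc_token_count, int num_docs) {
    int doc = blockIdx.x;                              // one document per block
    if (doc >= num_docs) return;
    for (int t = doc_offsets[doc] + threadIdx.x;
         t < doc_offsets[doc + 1]; t += blockDim.x) {
        int word = token_ids[t];
        if (word >= 0) atomicAdd(&doc_token_count[doc], 1); // placeholder update
    }
}

int main() {
    // Toy corpus: 3 documents with 4, 2, and 3 tokens (CSR-style offsets).
    const int h_tokens[]  = {5, 9, 5, 2, 7, 7, 1, 1, 3};
    const int h_offsets[] = {0, 4, 6, 9};
    const int num_docs = 3;

    int *d_tokens, *d_offsets, *d_counts;
    cudaMalloc(&d_tokens, sizeof(h_tokens));
    cudaMalloc(&d_offsets, sizeof(h_offsets));
    cudaMalloc(&d_counts, num_docs * sizeof(int));
    cudaMemcpy(d_tokens, h_tokens, sizeof(h_tokens), cudaMemcpyHostToDevice);
    cudaMemcpy(d_offsets, h_offsets, sizeof(h_offsets), cudaMemcpyHostToDevice);
    cudaMemset(d_counts, 0, num_docs * sizeof(int));

    inferDocuments<<<num_docs, 128>>>(d_tokens, d_offsets, d_counts, num_docs);

    int h_counts[num_docs];
    cudaMemcpy(h_counts, d_counts, sizeof(h_counts), cudaMemcpyDeviceToHost);
    for (int d = 0; d < num_docs; ++d)
        printf("doc %d: %d tokens processed\n", d, h_counts[d]);
    return 0;
}
```

    In a Hadoop deployment, each node's task would hand its local document batch to a launch like this, which is what lets the GPUs on multiple nodes work in parallel.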