7 research outputs found

    Implementation of digital pheromones in PSO accelerated by commodity Graphics Hardware

    In this paper, a model for Graphics Processing Unit (GPU) implementation of Particle Swarm Optimization (PSO) using digital pheromones to coordinate swarms within n-dimensional design spaces is presented. Previous work by the authors demonstrated the capability of digital pheromones within PSO for searching n-dimensional design spaces with improved accuracy, efficiency and reliability in both serial and parallel computing environments using traditional CPUs. Modern GPUs have been shown to outperform CPUs in floating-point operation throughput through their inherently data-parallel architecture and higher bandwidth. The recent advent of programmable graphics hardware has further provided a suitable platform for scientific computing, particularly in the field of design optimization. However, the data-parallel architecture of GPUs requires a specialized formulation to leverage its computational capabilities. When the objective function computations are appropriately formulated for GPUs, it is theorized that the solution efficiency (speed) can be significantly increased while maintaining solution accuracy. The development of this method is presented in this paper, together with results from a number of multi-modal unconstrained test problems.

    Digital Pheromone Implementation of PSO with Velocity Vector Accelerated by Commodity Graphics Hardware

    In this paper, a model for Graphics Processing Unit (GPU) implementation of Particle Swarm Optimization (PSO) using digital pheromones to coordinate swarms within n-dimensional design spaces is presented. In particular, the velocity vector computations are carried out on graphics hardware. Previous work by the authors demonstrated the capability of digital pheromones within PSO for searching n-dimensional design spaces with improved accuracy, efficiency and reliability in serial, parallel and GPU computing environments. The GPU implementation was limited to computing the objective function values alone. Modern GPUs have been shown to outperform CPUs in floating-point operation throughput through their inherently data-parallel architecture and higher bandwidth. This paper presents a method to implement velocity vector computations on a GPU along with objective function evaluations. Three different modes of implementation are studied and presented. First, CPU-CPU, where the objective function and velocity vector are both calculated on the CPU. Second, GPU-CPU, where the objective function is computed on the GPU and the velocity vector is computed on the CPU. Third, GPU-GPU, where the objective function and velocity vector are both evaluated on the GPU. The results from these three implementations are presented, followed by conclusions and recommendations on the best approach for utilizing the full potential of GPUs for PSO.
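    For readers unfamiliar with the velocity vector computation that the abstract above moves onto the GPU, the following is a minimal CPU-side sketch of one textbook PSO update step. It is an illustration under stated assumptions, not the authors' method: the function name `pso_step`, the coefficient defaults (`w`, `c1`, `c2`), and the list-based data layout are all hypothetical, and the digital-pheromone term described in the paper is not reproduced here.

```python
import random


def pso_step(positions, velocities, pbest, gbest,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """One textbook PSO update (no digital-pheromone term).

    positions, velocities, pbest: lists of per-particle coordinate lists.
    gbest: coordinates of the best position found by the whole swarm.
    w, c1, c2: inertia, cognitive and social coefficients (assumed values).
    Returns (new_positions, new_velocities).
    """
    new_positions, new_velocities = [], []
    for x, v, p in zip(positions, velocities, pbest):
        nv = []
        for d in range(len(x)):
            # Independent random weights per particle and per dimension.
            r1, r2 = rng.random(), rng.random()
            nv.append(w * v[d]
                      + c1 * r1 * (p[d] - x[d])       # pull toward personal best
                      + c2 * r2 * (gbest[d] - x[d]))  # pull toward swarm best
        new_velocities.append(nv)
        new_positions.append([xd + vd for xd, vd in zip(x, nv)])
    return new_positions, new_velocities
```

    Each particle's update depends only on its own state and the shared swarm best, which is the kind of per-element independence that a data-parallel formulation can exploit.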

    High performance bioinformatics and computational biology on general-purpose graphics processing units

    Bioinformatics and Computational Biology (BCB) is a relatively new multidisciplinary field which brings together many aspects of the fields of biology, computer science, statistics, and engineering. Bioinformatics extracts useful information from biological data and makes these more intuitive and understandable by applying principles of information sciences, while computational biology harnesses computational approaches and technologies to answer biological questions conveniently. Recent years have seen an explosion of the size of biological data at a rate which outpaces the rate of increases in the computational power of mainstream computer technologies, namely general purpose processors (GPPs). The aim of this thesis is to explore the use of off-the-shelf Graphics Processing Unit (GPU) technology in the high performance and efficient implementation of BCB applications in order to meet the demands of biological data increases at affordable cost. The thesis presents detailed design and implementations of GPU solutions for a number of BCB algorithms in two widely used BCB applications, namely biological sequence alignment and phylogenetic analysis. Biological sequence alignment can be used to determine the potential information about a newly discovered biological sequence from other well-known sequences through similarity comparison. On the other hand, phylogenetic analysis is concerned with the investigation of the evolution and relationships among organisms, and has many uses in the fields of system biology and comparative genomics. In molecular-based phylogenetic analysis, the relationship between species is estimated by inferring the common history of their genes and then phylogenetic trees are constructed to illustrate evolutionary relationships among genes and organisms. 
However, both biological sequence alignment and phylogenetic analysis are computationally expensive applications as their computing and memory requirements grow polynomially or even worse with the size of sequence databases. The thesis firstly presents a multi-threaded parallel design of the Smith-Waterman (SW) algorithm alongside an implementation on NVIDIA GPUs. A novel technique is put forward to solve the restriction on the length of the query sequence in previous GPU-based implementations of the SW algorithm. Based on this implementation, the difference between two main task parallelization approaches (Inter-task and Intra-task parallelization) is presented. The resulting GPU implementation matches the speed of existing GPU implementations while providing more flexibility, i.e. flexible length of sequences in real world applications. It also outperforms an equivalent GPP-based implementation by 15x-20x. After this, the thesis presents the first reported multi-threaded design and GPU implementation of the Gapped BLAST with Two-Hit method algorithm, which is widely used for aligning biological sequences heuristically. This achieved up to 3x speed-up improvements compared to the most optimised GPP implementations. The thesis then presents a multi-threaded design and GPU implementation of a Neighbor-Joining (NJ)-based method for phylogenetic tree construction and multiple sequence alignment (MSA). This achieves 8x-20x speed up compared to an equivalent GPP implementation based on the widely used ClustalW software. The NJ method however only gives one possible tree which strongly depends on the evolutionary model used. A more advanced method uses maximum likelihood (ML) for scoring phylogenies with Markov Chain Monte Carlo (MCMC)-based Bayesian inference. 
The latter was the subject of another multi-threaded design and GPU implementation presented in this thesis, which achieved 4x-8x speed up compared to an equivalent GPP implementation based on the widely used MrBayes software. Finally, the thesis presents a general evaluation of the designs and implementations achieved in this work as a step towards the evaluation of GPU technology in BCB computing, in the context of other computer technologies including GPPs and Field Programmable Gate Array (FPGA) technology.
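    As background for the Smith-Waterman (SW) algorithm named in the abstract above, here is a minimal serial sketch of the local-alignment score recurrence. It is not the thesis's GPU design: the function name `smith_waterman_score`, the linear gap penalty, and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty).

    Keeps only two rows of the dynamic-programming matrix, so memory
    use is proportional to len(b) rather than len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                 # start a new local alignment here
                prev[j - 1] + s,   # align a[i-1] with b[j-1]
                prev[j] + gap,     # gap in b
                cur[j - 1] + gap,  # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best
```

    Each cell depends on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another (the basis of intra-task parallelization), while different query/database pairs are fully independent (inter-task parallelization).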

    A feature-based shape similarity assessment framework

    The popularity of 3D CAD systems is resulting in a large number of CAD models being generated. Availability of these CAD models is opening up new ways in which information can be archived, analyzed, and reused. 3D geometric information is one of the main components of CAD models. Therefore shape similarity assessment is a fundamental geometric reasoning problem that finds several different applications. In many design and manufacturing applications, the gross shape of the 3D parts does not play an important role in the similarity assessment. Instead certain attributes of part features play a dominant role in determining the similarity between two parts. Different feature-based models are usually created using their own coordinate systems. Therefore, feature-based shape similarity assessment involves finding the optimal alignment transformations for two sets of feature vectors. The optimal alignment corresponds to the minimum value of a distance function that is computed between the two sets of feature vectors being aligned. In order to compute the distance function the closest neighbor to each feature vector needs to be identified. We have developed optimal feature alignment algorithms based on the partitioning of the transformation space into regions such that the closest neighbors are invariant within each region. These algorithms can work with customizable distance functions. We have shown that they have polynomial time complexity. For higher dimension transformation spaces it is harder to design algorithms based on the partitioning of transformation spaces because the data structures involved are very complex. In those cases, feature alignment algorithms based on iterative strategies have been developed. Iterative strategies make use of optimal feature alignment algorithms based on the partitioning of lower dimension transformation spaces. 
Extensive experiments have been carried out to provide empirical evidence that iterative strategies can find the optimal solution for feature alignment problems. A feature-based shape similarity analysis framework has been built based on the feature alignment algorithms. This framework has been demonstrated with the following two applications. A machining feature based alignment algorithm has been developed to automatically search databases for parts that are similar to a newly designed part in terms of machining features. We expect that the retrieved parts can be used as a basis to perform cost estimation of the newly designed part. A surface feature based alignment algorithm has been developed to automatically search databases for parts that are similar to a newly designed part in terms of surface features. We expect that the retrieved parts can be used as a basis to choose the most appropriate tool maker for the newly designed part. We believe that the feature-based shape similarity assessment algorithms developed in this thesis will provide the foundations for designing new feature-based shape similarity algorithms that will enable designers to efficiently retrieve archived geometric information. We expect that these tools will facilitate information reuse and therefore decrease product development time and cost.
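    The abstract above describes a distance function between two sets of feature vectors that relies on finding each vector's closest neighbor. A minimal sketch of that idea follows. The thesis states that its distance functions are customizable and does not specify one here, so the Euclidean metric, the sum aggregation, the translation-only transformation, and the names `closest_neighbor_distance` and `translate` are all assumptions made for illustration.

```python
import math


def closest_neighbor_distance(set_a, set_b):
    """Sum, over each vector in set_a, of the Euclidean distance
    to its nearest vector in set_b (assumed form of the distance)."""
    return sum(min(math.dist(a, b) for b in set_b) for a in set_a)


def translate(vectors, offset):
    """Apply a translation (one simple alignment transformation)."""
    return [tuple(v + o for v, o in zip(vec, offset)) for vec in vectors]


# Hypothetical 2D feature vectors: the two sets coincide exactly,
# so the zero translation is optimal and gives distance 0.
part_a = [(0.0, 0.0), (1.0, 0.0)]
part_b = [(0.0, 0.0), (1.0, 0.0)]
aligned = closest_neighbor_distance(part_a, part_b)
shifted = closest_neighbor_distance(translate(part_a, (0.0, 3.0)), part_b)
```

    Finding the optimal alignment then amounts to searching the transformation space for the transformation that minimizes this value.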

    Efficient algorithms for new computational models

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. Includes bibliographical references (p. 155-163). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Advances in hardware design and manufacturing often lead to new ways in which problems can be solved computationally. In this thesis we explore fundamental problems in three computational models that are based on such recent advances. The first model is based on new chip architectures, where multiple independent processing units are placed on one chip, allowing for an unprecedented parallelism in hardware. We provide new scheduling algorithms for this computational model. The second model is motivated by peer-to-peer networks, where countless (often inexpensive) computing devices cooperate in distributed applications without any central control. We state and analyze new algorithms for load balancing and for locality-aware distributed data storage in peer-to-peer networks. The last model is based on extensions of the streaming model. It is an attempt to capture the class of problems that can be efficiently solved on massive data sets. We give a number of algorithms for this model, and compare it to other models that have been proposed for massive data set computations. Our algorithms and complexity results for these computational models follow the central thesis that it is an important part of theoretical computer science to model real-world computational structures, and that such effort is richly rewarded by a plethora of interesting and challenging problems. By Jan Matthias Ruhl. Ph.D.