13,079 research outputs found

    Parallel Genetic Algorithms with Application to Load Balancing for Parallel Computing

    A new coarse-grain parallel genetic algorithm (PGA) and a new implementation of a data-parallel GA are presented in this paper. They are based on models of natural evolution in which the population is formed of discontinuous or continuous subpopulations. In addition to simulating natural evolution, the intrinsic parallelism in the two PGAs minimizes the possibility of premature convergence that implementations of classic GAs often encounter. Intrinsic parallelism also allows fit genotypes to evolve in fewer generations in the PGAs than in sequential GAs, leading to superlinear speed-ups. The PGAs have been implemented on a hypercube and a Connection Machine, and their operation is demonstrated by applying them to the load balancing problem in parallel computing. The PGAs have found near-optimal solutions which are comparable to the solutions of a simulated annealing algorithm and better than those produced by a sequential GA and by other load balancing methods. On the one hand, the PGAs accentuate the advantage of parallel computers for simulating natural evolution; on the other hand, they represent new techniques for load balancing parallel computations.
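    The abstract does not spell out the PGA's encoding or operators, but the coarse-grain (island-model) idea applied to task-to-processor load balancing can be sketched roughly as follows. The task costs, genetic operators, and ring migration policy below are illustrative assumptions, and the islands are iterated sequentially so the example stays self-contained; a real coarse-grain PGA would evolve each island on its own processor.

```python
# Minimal island-model (coarse-grain) GA sketch for task-to-processor load balancing.
# Encoding, operators, and migration are illustrative assumptions, not the paper's PGA.
import random

TASK_COSTS = [random.randint(1, 20) for _ in range(40)]   # hypothetical task weights
NUM_PROCS = 4                                              # target processors
NUM_ISLANDS, ISLAND_SIZE, GENERATIONS = 4, 30, 200

def fitness(chromosome):
    """Negative makespan: higher is better (more balanced)."""
    loads = [0] * NUM_PROCS
    for task, proc in enumerate(chromosome):
        loads[proc] += TASK_COSTS[task]
    return -max(loads)

def crossover(a, b):
    cut = random.randrange(len(a))
    return a[:cut] + b[cut:]

def mutate(c, rate=0.05):
    return [random.randrange(NUM_PROCS) if random.random() < rate else g for g in c]

def evolve(island):
    """One generation: elitism plus crossover and mutation of the fitter half."""
    island.sort(key=fitness, reverse=True)
    next_gen = island[:2]                                  # keep the two best
    while len(next_gen) < ISLAND_SIZE:
        a, b = random.sample(island[:ISLAND_SIZE // 2], 2)
        next_gen.append(mutate(crossover(a, b)))
    return next_gen

islands = [[[random.randrange(NUM_PROCS) for _ in TASK_COSTS]
            for _ in range(ISLAND_SIZE)] for _ in range(NUM_ISLANDS)]

for gen in range(GENERATIONS):
    islands = [evolve(isl) for isl in islands]
    if gen % 25 == 0:                                      # occasional ring migration
        for i, isl in enumerate(islands):
            best = max(isl, key=fitness)
            islands[(i + 1) % NUM_ISLANDS].append(best[:])

best = max((max(isl, key=fitness) for isl in islands), key=fitness)
print("best makespan:", -fitness(best))
```

    The occasional migration of the best individual between otherwise isolated subpopulations is what distinguishes the coarse-grain model from a single panmictic GA and helps delay premature convergence.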

    A resource aware MapReduce based parallel SVM for large scale image classification

    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents RASMO, a resource aware MapReduce based parallel SVM algorithm for large scale image classification which partitions the training dataset into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments. The results show that the parallel SVM algorithm reduces the training time significantly compared with the sequential SMO algorithm while maintaining a high level of accuracy in classification.
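    RASMO's internals are not given in the abstract, but the partition-and-train idea behind a MapReduce based parallel SVM can be sketched as below. The scikit-learn SVC models, the multiprocessing pool standing in for MapReduce map tasks, and the majority-vote reduce step are all assumptions for illustration, not RASMO's actual merging or resource-aware scheduling.

```python
# Illustrative sketch: each "map" task trains an SVM on one data chunk; the
# "reduce" step here simply majority-votes the chunk models.
from multiprocessing import Pool
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def train_chunk(chunk):
    """Map phase: fit one SVM on a single partition of the training data."""
    X, y = chunk
    return SVC(kernel="rbf", gamma="scale").fit(X, y)

def predict_vote(models, X):
    """Reduce phase (simplified): majority vote over the chunk-level models."""
    votes = np.stack([m.predict(X) for m in models])
    return np.round(votes.mean(axis=0))

if __name__ == "__main__":
    X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
    chunks = list(zip(np.array_split(X, 4), np.array_split(y, 4)))
    with Pool(processes=4) as pool:            # stands in for MapReduce map tasks
        models = pool.map(train_chunk, chunks)
    print("train accuracy:", (predict_vote(models, X) == y).mean())
```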

    A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications

    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, support vector machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents RASMO, a resource aware MapReduce based parallel SVM algorithm for large scale image classifications which partitions the training dataset into smaller subsets and optimizes SVM training in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of RASMO in heterogeneous computing environments. RASMO is evaluated in both experimental and simulation environments. The results show that the parallel SVM algorithm reduces the training time significantly compared with the sequential SMO algorithm while maintaining a high level of accuracy in classifications.
    National Basic Research Program (973) of China under Grant 2014CB34040
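    The genetic algorithm based load balancing scheme itself is not described in the abstract. One plausible reading, sketched below, is a GA that searches for data-chunk shares across heterogeneous nodes so that the estimated training time of the slowest node is minimised; the chromosome encoding, node speeds, and cost model are assumptions, not RASMO's actual design.

```python
# Minimal GA sketch for balancing data-chunk shares across heterogeneous nodes.
# Fitness penalises the slowest node under a simple chunk_size / node_speed model.
import random

NODE_SPEEDS = [1.0, 0.7, 0.5, 1.3]        # hypothetical relative node capacities
TOTAL_SAMPLES = 100_000
POP, GENS = 40, 300

def normalise(weights):
    s = sum(weights)
    return [w / s for w in weights]

def fitness(fractions):
    """Negative makespan: time of the slowest node given its chunk share."""
    times = [f * TOTAL_SAMPLES / speed for f, speed in zip(fractions, NODE_SPEEDS)]
    return -max(times)

def mutate(fractions, sigma=0.05):
    return normalise([max(1e-6, f + random.gauss(0, sigma)) for f in fractions])

population = [normalise([random.random() for _ in NODE_SPEEDS]) for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]
    population = parents + [mutate(random.choice(parents)) for _ in parents]

best = max(population, key=fitness)
print("chunk shares:", [round(f, 3) for f in best])
```

    With the example speeds above the GA should converge toward shares roughly proportional to node capacity, which is the intuition behind resource aware partitioning.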

    Parallelizing support vector machines for scalable image annotation

    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. In this thesis distributed computing paradigms have been investigated to speed up SVM training by partitioning a large training dataset into small data chunks and processing each chunk in parallel, utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments. SVM was initially designed for binary classification. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation using a cluster of computers is introduced. The combination of classifiers leads to a substantial reduction of classification error in a wide range of applications. Among them, SVM ensembles with bagging are shown to outperform a single SVM in terms of classification accuracy. However, training SVM ensembles is a notably computationally intensive process, especially when the number of replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data based on bootstrapping and trains an SVM on each sample in parallel using a cluster of computers. The above algorithms are evaluated in both experimental and simulation environments, showing that the distributed SVM algorithm, the distributed multiclass SVM algorithm, and the distributed SVM ensemble algorithm reduce the training time significantly while maintaining a high level of accuracy in classifications.
    EThOS - Electronic Theses Online Service, GB, United Kingdom
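    As a rough illustration of the distributed SVM ensemble idea described above, the sketch below bootstraps the training set, fits one SVM per replicate in a worker pool, and majority-votes the predictions. The replicate count, the scikit-learn SVC models, and the voting rule are illustrative assumptions rather than the thesis's configuration.

```python
# Sketch of a bagged SVM ensemble trained in parallel over bootstrap replicates.
from multiprocessing import Pool
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def train_bootstrap(args):
    """Train one SVM on a bootstrap replicate of the training data."""
    X, y, seed = args
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))     # sample with replacement
    return SVC(kernel="rbf", gamma="scale").fit(X[idx], y[idx])

if __name__ == "__main__":
    X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
    jobs = [(X, y, seed) for seed in range(8)]     # 8 bootstrap replicates
    with Pool(processes=4) as pool:                # one worker per CPU / cluster node
        ensemble = pool.map(train_bootstrap, jobs)
    votes = np.stack([m.predict(X) for m in ensemble]).mean(axis=0)
    print("ensemble train accuracy:", ((votes > 0.5).astype(int) == y).mean())
```

    Because each replicate is trained independently, the ensemble parallelises trivially across nodes, which is what makes the bagging variant attractive for a cluster setting.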

    A Three-Level Parallelisation Scheme and Application to the Nelder-Mead Algorithm

    We consider a three-level parallelisation scheme. The second and third levels define a classical two-level parallelisation scheme, and some load balancing algorithm is used to distribute tasks among processes. It is well known that for many applications the efficiency of parallel algorithms at the second and third levels starts to drop after some critical parallelisation degree is reached. This weakness of the two-level template is addressed by the introduction of one additional parallelisation level. As an alternative to the basic solver, some new or modified algorithms are considered at this level. The idea of the proposed methodology is to increase the parallelisation degree by using algorithms that are less efficient than the basic solver. As an example we investigate two modified Nelder-Mead methods. For the selected application, a few partial differential equations are solved numerically at the second level, and at the third level the parallel Wang algorithm is used to solve systems of linear equations with tridiagonal matrices. A greedy workload balancing heuristic is proposed, oriented to the case of a large number of available processors. The complexity estimates of the computational tasks are model-based, i.e. they use empirical computational data.
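    The greedy workload balancing heuristic itself is not detailed in the abstract. A standard stand-in, shown below, assigns tasks largest-estimate-first to the currently least loaded processor group; the cost estimates and the grouping are hypothetical, meant only to illustrate how model-based estimates could drive such a heuristic.

```python
# Greedy (largest-first) workload balancing over model-based cost estimates.
import heapq

def greedy_balance(task_costs, num_groups):
    """Assign tasks to groups, largest cost first, always to the lightest group."""
    heap = [(0.0, g, []) for g in range(num_groups)]      # (load, group id, tasks)
    heapq.heapify(heap)
    for task, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        load, g, tasks = heapq.heappop(heap)
        heapq.heappush(heap, (load + cost, g, tasks + [task]))
    return sorted(heap, key=lambda e: e[1])

# Hypothetical model-based estimates, e.g. cost proportional to grid size per PDE subtask.
estimates = [950, 400, 400, 720, 130, 610, 610, 280]
for load, group, tasks in greedy_balance(estimates, num_groups=3):
    print(f"group {group}: load {load}, tasks {tasks}")
```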
