693 research outputs found

    Parallel local search for solving Constraint Problems on the Cell Broadband Engine (Preliminary Results)

    We explore the use of the Cell Broadband Engine (Cell/BE for short) for combinatorial optimization applications: we present a parallel version of a constraint-based local search algorithm that has been implemented on a multiprocessor BladeCenter machine with twin Cell/BE processors (a total of 16 SPUs per blade). This algorithm was chosen because it fits the Cell/BE architecture very well: it requires neither shared memory nor communication between processors, and it retains a compact memory footprint. We study the performance on several large optimization benchmarks and show that the parallel version achieves mostly linear, and sometimes even super-linear, speedups. This is possible because the parallel implementation may explore different parts of the search space simultaneously and therefore converge faster towards the best sub-space and thus towards a solution. Besides the speedups, the resulting times exhibit a much smaller variance, which benefits applications where a timely reply is critical.
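The independent multi-walk idea described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it uses a generic min-conflicts heuristic on N-queens as a stand-in, the names `local_search` and `multi_walk` are hypothetical, and the walks are simulated one after another rather than run on separate SPUs. Since the walks share no state, the step count of the fastest walk approximates the parallel completion time.

```python
import random


def conflicts(cols, row):
    """Count queens attacking the queen in `row` (one queen per row)."""
    return sum(
        1
        for r, c in enumerate(cols)
        if r != row and (c == cols[row] or abs(c - cols[row]) == abs(r - row))
    )


def local_search(n, seed, max_steps=10_000, noise=0.1):
    """One independent walk: min-conflicts with a little random noise.

    Returns (solution, steps); solution is None if the step budget runs out.
    """
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]
    for step in range(max_steps):
        bad = [r for r in range(n) if conflicts(cols, r) > 0]
        if not bad:
            return cols, step
        row = rng.choice(bad)
        if rng.random() < noise:
            # Occasional random move to escape local minima.
            cols[row] = rng.randrange(n)
            continue
        scores = []
        for c in range(n):
            cols[row] = c
            scores.append(conflicts(cols, row))
        best = min(scores)
        cols[row] = rng.choice([c for c, s in enumerate(scores) if s == best])
    return None, max_steps


def multi_walk(n, seeds):
    """Run one walk per seed with no shared state and keep the fastest."""
    results = [local_search(n, seed) for seed in seeds]
    solved = [(steps, sol) for sol, steps in results if sol is not None]
    if not solved:
        return None, None, results
    steps, solution = min(solved, key=lambda item: item[0])
    return solution, steps, results


solution, fastest_steps, all_results = multi_walk(8, range(16))
```

Comparing `fastest_steps` with the individual step counts in `all_results` gives a feel for why running many walks from different starting points can reduce both the time to a solution and its variance.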

    Performance analysis and optimization of parallel Best-First Search algorithms on multicore and cluster of multicore

    The contribution of the thesis is the development of two parallel Best-First Search algorithms: one suitable for execution on shared-memory machines (multicore), and another suitable for execution on distributed-memory machines (cluster). The former is based on the adaptation of the HDA* (Hash Distributed A*) algorithm for multicore machines proposed by Burns et al. (2010), while the latter is based on the HDA* algorithm proposed by Kishimoto et al. (2013). The implemented algorithms incorporate parameters and/or techniques that improve their performance relative to the original algorithms proposed by those authors.
    Is a revision of: http://sedici.unlp.edu.ar/handle/10915/44478
    Summary of the thesis presented by the author to obtain the degree of Doctor in Computer Science (UNLP, 2015).
    Facultad de Informátic
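As a rough illustration of the hash-distribution idea behind HDA*, each search state can be given a single owning worker by hashing it, and newly generated states are routed to the owner's queue. The sketch below is a minimal illustration only, not the thesis implementation; the helper names `owner` and `route` and the use of `zlib.crc32` as the hash are assumptions made for the example.

```python
import zlib


def owner(state, num_workers):
    """Map a state to the worker that owns it, using a stable hash.

    zlib.crc32 is used instead of the built-in hash() so that the mapping
    is identical across processes and runs.
    """
    key = repr(state).encode("utf-8")
    return zlib.crc32(key) % num_workers


def route(states, num_workers):
    """Place each generated state in the outbox of its owning worker."""
    outboxes = [[] for _ in range(num_workers)]
    for state in states:
        outboxes[owner(state, num_workers)].append(state)
    return outboxes


# Hypothetical successor states (grid coordinates) generated by one worker.
successors = [(0, 1), (1, 0), (1, 1), (2, 1), (1, 2)]
outboxes = route(successors, num_workers=4)
```

Because the owner depends only on the state itself, duplicate states generated by different workers always arrive at the same worker, which can then detect them locally.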

    Distributed-Memory Breadth-First Search on Massive Graphs

    This chapter studies the problem of traversing large graphs in breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm and the recently discovered direction-optimizing algorithm. We analyze the performance and scalability trade-offs of using different local data structures such as CSR and DCSC, enabling in-node multithreading, and using graph decompositions such as 1D and 2D decomposition.
    Comment: arXiv admin note: text overlap with arXiv:1104.451
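For readers unfamiliar with the terms, a serial sketch of level-synchronous top-down BFS over a CSR (compressed sparse row) adjacency structure is given below. It only illustrates the basic traversal pattern on a single node; the distributed decompositions and the direction-optimizing variant discussed in the chapter are not shown, and the array names `row_ptr` and `col_idx` are conventional choices rather than the chapter's notation.

```python
def bfs_levels(row_ptr, col_idx, source):
    """Level-synchronous top-down BFS on a graph stored in CSR form.

    The neighbours of vertex u are col_idx[row_ptr[u]:row_ptr[u + 1]].
    Returns the BFS level of every vertex, or -1 if it is unreachable.
    """
    n = len(row_ptr) - 1
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Expand the whole current frontier before moving to the next level.
        for u in frontier:
            for k in range(row_ptr[u], row_ptr[u + 1]):
                v = col_idx[k]
                if level[v] == -1:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level


# Undirected example: edges 0-1, 0-2, 1-3, 2-3, 3-4; vertex 5 is isolated.
row_ptr = [0, 2, 4, 6, 9, 10, 10]
col_idx = [1, 2, 0, 3, 0, 3, 1, 2, 4, 3]
levels = bfs_levels(row_ptr, col_idx, source=0)
# levels == [0, 1, 1, 2, 3, -1]
```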

    Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures

    The protein-folding problem has been extensively studied during the last fifty years. Understanding the dynamics of the global shape of a protein and its influence on biological function can help us discover new and more effective drugs for diseases of pharmacological relevance. Researchers have developed different computational approaches to predict the three-dimensional arrangement of the atoms of proteins from their sequences. However, the computational complexity of this problem makes it necessary to search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame. In this review we present past and recent trends in protein folding simulations from two perspectives: hardware and software. Of particular interest to us are the use of inexact solutions to this computationally hard problem and the hardware platforms that have been used to run this kind of Soft Computing technique.
    This work is jointly supported by the Fundación Séneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13, by the Spanish MEC and European Commission FEDER under grants TEC2012-37945-C02-02 and TIN2012-31345, and by the Nils Coordinated Mobility under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF). We also thank NVIDIA for hardware donation within UCAM GPU educational and research centers.
    Ingeniería, Industria y Construcció

    GPGPU for Difficult Black-box Problems

    Difficult black-box problems arise in many scientific and industrial areas. In this paper, the efficient use of a hardware accelerator to implement dedicated solvers for such problems is discussed and studied, using the Golomb Ruler problem as an example. The problem is solved with evolutionary and memetic algorithms accelerated on GPGPU. The presented results show that GPGPU outperforms CPU in some memetic algorithms, which can be used as part of a hybrid algorithm for finding near-optimal solutions of the Golomb Ruler problem. The presented research is part of building a heterogeneous parallel algorithm for the difficult black-box Golomb Ruler problem.
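For context, a Golomb ruler is a set of integer marks in which all pairwise differences are distinct. The sketch below shows one plausible penalty that a search algorithm could minimise, namely the number of repeated differences; this is an assumption for illustration, and the paper's actual fitness function and GPGPU kernels are not reproduced here. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def difference_violations(marks):
    """Count how many pairwise differences repeat an earlier difference."""
    counts = Counter(b - a for a, b in combinations(sorted(marks), 2))
    return sum(c - 1 for c in counts.values())


def is_golomb_ruler(marks):
    """True if the marks are distinct and all pairwise differences differ."""
    return len(set(marks)) == len(marks) and difference_violations(marks) == 0


print(is_golomb_ruler([0, 1, 4, 9, 11]))     # True: ten distinct differences
print(difference_violations([0, 1, 2, 4]))   # 2: differences 1 and 2 each repeat once
```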

    METADOCK: A parallel metaheuristic schema for virtual screening methods

    Virtual screening through molecular docking can be translated into an optimization problem, which can be tackled with metaheuristic methods. The interaction between two chemical compounds (typically a protein, enzyme or receptor, and a small molecule, or ligand) is calculated using highly computationally demanding scoring functions that are computed at several binding spots located throughout the protein surface. This paper introduces METADOCK, a novel molecular docking methodology based on parameterized and parallel metaheuristics and designed to leverage computers built on heterogeneous architectures. The application selects the optimization technique at run time by setting a configuration schema. Our proposed solution finds a good workload balance via dynamic assignment of jobs to heterogeneous resources, which perform independent metaheuristic executions when computing the different molecular interactions required by the scoring functions in use. Cooperative scheduling of jobs optimizes the quality of the solution and the overall performance of the simulation, thus opening a new path for further developments of virtual screening methods on contemporary high-performance heterogeneous platforms.
    Ingeniería, Industria y Construcció
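The dynamic assignment of jobs to heterogeneous resources mentioned above can be pictured with a small simulation. This is only a sketch under simple assumptions (each job has a known cost, each worker a fixed relative speed, and the next job goes to whichever worker becomes idle first); it is not METADOCK's scheduler, and `dynamic_assign` is a hypothetical name.

```python
import heapq


def dynamic_assign(job_costs, worker_speeds):
    """Simulate handing each job to the worker that becomes idle first.

    Returns (assignment, makespan), where assignment[w] lists the job
    indices executed by worker w and makespan is the finish time of the
    last worker.
    """
    # Min-heap of (time at which the worker becomes idle, worker id).
    idle = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(idle)
    assignment = [[] for _ in worker_speeds]
    for job, cost in enumerate(job_costs):
        free_at, w = heapq.heappop(idle)
        assignment[w].append(job)
        heapq.heappush(idle, (free_at + cost / worker_speeds[w], w))
    makespan = max(free_at for free_at, _ in idle)
    return assignment, makespan


# Six equal jobs; the second worker is assumed to be twice as fast.
assignment, makespan = dynamic_assign([4.0] * 6, [1.0, 2.0])
# assignment == [[0, 3], [1, 2, 4, 5]], makespan == 8.0
```

In this example the faster worker ends up with twice as many jobs, and both workers finish at the same time, which is the kind of balance that dynamic assignment aims for without knowing the speeds in advance.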