20 research outputs found

    GPGPU for Difficult Black-box Problems

    Get PDF
    Abstract: Difficult black-box problems arise in many scientific and industrial areas. In this paper, the efficient use of a hardware accelerator to implement dedicated solvers for such problems is discussed and studied using the example of the Golomb Ruler problem. The actual solution of the problem is presented based on evolutionary and memetic algorithms accelerated on GPGPU. The presented results show that the GPGPU outperforms the CPU in some memetic algorithms, which can be used as part of a hybrid algorithm for finding near-optimal solutions to the Golomb Ruler problem. The presented research is part of building a heterogeneous parallel algorithm for the difficult black-box Golomb Ruler problem.
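    A minimal sketch of the kind of fitness function such an evolutionary or memetic solver needs (illustrative only, not the authors' implementation; the penalty weight is an arbitrary assumption):

        from itertools import combinations

        def golomb_violations(marks):
            """Count repeated pairwise differences; 0 means a valid Golomb ruler."""
            diffs = [b - a for a, b in combinations(sorted(marks), 2)]
            return len(diffs) - len(set(diffs))

        def fitness(marks):
            """Lower is better: ruler length plus a heavy penalty per repeated
            difference, driving the search toward short, valid rulers."""
            return max(marks) - min(marks) + 1000 * golomb_violations(marks)

        # [0, 1, 4, 9, 11] is a perfect Golomb ruler with 5 marks.
        assert golomb_violations([0, 1, 4, 9, 11]) == 0
        print(fitness([0, 1, 4, 9, 11]))   # 11: the ruler's length, no penalty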

    Towards hybrid methods for solving hard combinatorial optimization problems

    Full text link
    Doctoral thesis defended at the Escuela Politécnica Superior of the Universidad Autónoma de Madrid on 4 September 200

    GPU parallelization strategies for metaheuristics: a survey

    Get PDF
    Metaheuristics have been showing interesting results in solving hard optimization problems. However, they become limited in terms of effectiveness and runtime for high-dimensional problems. Thanks to the independence of metaheuristic components, parallel computing appears as an attractive choice to reduce the execution time and to improve solution quality. By exploiting the increasing performance and programmability of graphics processing units (GPUs) to this aim, GPU-based parallel metaheuristics have been implemented using different designs. Recent results in this area show that GPUs tend to be effective co-processors for tackling complex optimization problems. In this survey, the mechanisms involved in GPU programming for implementing parallel metaheuristics are presented and discussed through a study of relevant research papers. Metaheuristics can obtain satisfying results when solving optimization problems in a reasonable time, but they suffer from a lack of scalability and become limited when facing complex high-dimensional optimization problems. To overcome this limitation, GPU-based parallel computing appears as a strong alternative: thanks to GPUs, parallel metaheuristics have achieved better results in terms of computation time, and even solution quality.
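    A sketch of the master-worker parallel-evaluation design that many GPU-based parallel metaheuristics share; CPU processes stand in here for GPU threads, and all names are illustrative assumptions:

        from concurrent.futures import ProcessPoolExecutor

        def evaluate(candidate):
            # Stand-in for an expensive objective function.
            return sum(x * x for x in candidate)

        def evaluate_population(population):
            """Master-worker design: the costly fitness evaluations run in
            parallel workers; on a GPU the same pattern maps one candidate
            (or one fitness term) to each GPU thread."""
            with ProcessPoolExecutor() as pool:
                return list(pool.map(evaluate, population))

        if __name__ == "__main__":
            population = [[i, i + 1, i + 2] for i in range(8)]
            print(evaluate_population(population))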

    Massively parallel declarative computational models

    Get PDF
    Current computer architectures are parallel, with an increasing number of processors. Parallel programming is an error-prone task, and declarative models such as those based on constraints relieve the programmer of some of its difficult aspects because they abstract control away. In this work we study and develop techniques for declarative computational models based on constraints using GPI, a recent programming model and tool, aiming at large-scale parallel execution. The main contributions of this work are: (i) a GPI implementation of a scalable dynamic load-balancing scheme based on work stealing, suitable for tree-shaped computations and effective on systems with thousands of threads; (ii) a parallel constraint solver, MaCS, implemented to take advantage of the GPI programming model, whose experimental evaluation shows very good scalability on systems with hundreds of cores; and (iii) a GPI parallel version of the Adaptive Search algorithm, including different variants, whose study on different problems advances the understanding of the scalability issues known to exist with large numbers of cores.
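    The work-stealing idea behind contribution (i) can be pictured with a small sketch (a simplified thread-based illustration, not the GPI code): each worker owns a deque of subtree tasks, works off one end, and steals from the other end of a random victim when idle.

        import random
        import threading
        from collections import deque

        class Worker:
            """One worker with a local deque of (node, depth) tree tasks."""
            def __init__(self, workers, results, lock):
                self.workers = workers            # shared list of all workers
                self.tasks = deque()
                self.results, self.lock = results, lock

            def step(self):
                """Run one task; return False if no work could be found."""
                try:
                    node, depth = self.tasks.pop()            # own end: LIFO
                except IndexError:
                    victim = random.choice(self.workers)
                    try:
                        node, depth = victim.tasks.popleft()  # steal: FIFO
                    except IndexError:
                        return False
                if depth == 0:
                    with self.lock:
                        self.results.append(node)     # leaf: record result
                else:                                 # expand the tree node
                    self.tasks.append((2 * node, depth - 1))
                    self.tasks.append((2 * node + 1, depth - 1))
                return True

            def run(self, patience=1000):
                # Crude termination: give up after many consecutive failed
                # steals; real schemes detect global quiescence instead.
                idle = 0
                while idle < patience:
                    idle = 0 if self.step() else idle + 1

        def main():
            results, lock, workers = [], threading.Lock(), []
            for _ in range(4):
                workers.append(Worker(workers, results, lock))
            workers[0].tasks.append((1, 10))   # root of a depth-10 binary tree
            threads = [threading.Thread(target=w.run) for w in workers]
            for t in threads: t.start()
            for t in threads: t.join()
            print(len(results))                # expect 2**10 = 1024 leaves

        if __name__ == "__main__":
            main()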

    Efficient Methods for Finding Optimal Convolutional Self-Doubly Orthogonal Codes

    Get PDF
    Abstract: In recent years, the rise of ultrabooks and mobile devices has been accompanied by an ever-increasing need for reliable, high-bandwidth wireless communications. To mitigate or eliminate the errors that are invariably introduced by noise and interference in the communication channels, it is important to develop efficient error-correcting coding schemes. Indeed, these codes can preserve the error performance while allowing the data rate of digital communications to be increased and the transmission power at lower signal-to-noise ratios to be reduced, thereby improving the overall power efficiency of these devices. In this manuscript-based thesis, we present an efficient search algorithm for finding optimal/short-span Convolutional Self-Doubly Orthogonal (CDO) codes and Simplified Convolutional Self-Doubly Orthogonal (S-CDO) codes. These error-correcting codes are employed in an iterative error-control coding scheme that differs from the classical Turbo code procedure in that it requires no interleaver, neither at the encoding nor at the decoding stage. However, its iterative threshold-decoding procedure requires that these systematic convolutional codes satisfy certain "double orthogonality" properties, beyond those of the well-known orthogonal codes used in the usual non-iterative threshold decoding. In order to build high-performance, low-latency codecs with these codes, it is important to minimize the constraint length, also called the "span", for a given number J of generator connections. Although finding CDO/S-CDO codes is not difficult, determining the optimal/short-span codes for a given order J is computationally very challenging: the direct construction of optimal or shortest-span CDO and S-CDO codes has so far eluded analysis, and the search for these codes is believed to be an NP-complete problem. The thesis comprises three articles: two published in IEEE Transactions on Communications and one submitted to IEEE Transactions on Parallel and Distributed Systems. In these articles, we describe a novel, efficient, parallel, implicitly-exhaustive search algorithm for finding rate R = 1/2 systematic optimal/short-span CDO and S-CDO codes. The high-performance search algorithm is still exhaustive in nature, yet it provides an impressive speedup larger than 16300 (CDO, J=7) and 6300 (S-CDO, J=8) over the reference implicitly-exhaustive search algorithm, and larger than 2000 (CDO, J=17) over the fastest known CDO validation function used in high-performance pseudo-random search algorithms.
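    The flavor of an implicitly-exhaustive search can be sketched as pruned backtracking over connection positions. The validity test below is the simple distinct-differences (orthogonality) condition, standing in for the stricter double-orthogonality conditions defined in the articles; everything here is illustrative:

        from itertools import combinations

        def distinct_differences(conns):
            """Validity test on a partial connection set: all pairwise
            differences distinct. A stand-in for the stricter CDO/S-CDO
            double-orthogonality conditions."""
            diffs = [b - a for a, b in combinations(conns, 2)]
            return len(diffs) == len(set(diffs))

        def search(J, span_limit, partial=(0,)):
            """Implicitly-exhaustive search: extend the partial set one
            position at a time and prune every subtree whose prefix already
            fails the validity test, so most of the tree is never visited."""
            if len(partial) == J:
                return partial
            for nxt in range(partial[-1] + 1, span_limit + 1):
                cand = partial + (nxt,)
                if distinct_differences(cand):    # prune invalid prefixes
                    hit = search(J, span_limit, cand)
                    if hit:
                        return hit
            return None

        # Lowering span_limit until the search fails yields the optimal span.
        print(search(4, 6))    # J=4: finds (0, 1, 4, 6), a span-6 code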

    Pseudo-Boolean optimization using implicit hitting sets

    Get PDF
    There are many computationally difficult problems where the task is to find a solution with the lowest cost possible that fulfills a given set of constraints. Such problems are often NP-hard and are encountered in a variety of real-world problem domains, including planning and scheduling. NP-hard problems are often solved using a declarative approach: the problem is encoded into a declarative constraint language and the encoding is solved with a generic algorithm for that language. In this thesis we focus on pseudo-Boolean optimization (PBO), a special class of integer programs (IP) whose variables all take the values 0 or 1. We propose a novel approach to PBO based on the implicit hitting set (IHS) paradigm, which uses two separate components: an IP solver finds an optimal solution under an incomplete set of constraints, and a pseudo-Boolean satisfiability solver either validates the feasibility of that solution or extracts further constraints to add to the integer program. The IHS-based PBO algorithm iteratively invokes the two components until an optimal solution to the given PBO instance is found. In this thesis we lay out the IHS-based PBO solving approach in detail. We implement the algorithm as the PBO-IHS solver, making use of recent advances in reasoning techniques for pseudo-Boolean constraints. Through extensive empirical evaluation we show that our PBO-IHS solver outperforms other available specialized PBO solvers and has performance complementary to classical integer programming techniques.
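    The alternation between the two components can be sketched as follows; ip_optimize and pb_check are hypothetical stand-ins for the IP solver and the pseudo-Boolean reasoning engine, not the PBO-IHS API:

        def ihs_solve(objective, all_constraints, ip_optimize, pb_check):
            """Implicit-hitting-set loop: optimize over an incomplete,
            growing constraint set until the optimistic answer is feasible."""
            working = set()                    # incomplete constraint set
            while True:
                # Optimal w.r.t. the constraints collected so far, hence a
                # lower bound on the true optimum.
                candidate = ip_optimize(objective, working)
                # Either feasible for the full instance, or a source of new
                # constraints (cores) to add.
                feasible, cores = pb_check(candidate, all_constraints)
                if feasible:
                    # Optimal: attains a lower bound on the objective while
                    # satisfying every constraint of the instance.
                    return candidate
                working.update(cores)          # refine and iterate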

    Identifying sources of global contention in constraint satisfaction search

    Get PDF
    Much work has been done on learning from failure in search to boost the solving of combinatorial problems, such as clause learning and clause weighting in Boolean satisfiability (SAT), nogood and explanation-based learning, and constraint weighting in constraint satisfaction problems (CSPs). Many of the top solvers in SAT use clause learning to good effect; a similar approach (nogood learning) has not had as large an impact in CSPs. Constraint weighting is a less fine-grained approach in which the information learnt approximates which variables may be the sources of greatest contention. In this work we present two methods for learning from search using restarts, in order to identify these critical variables prior to solving. Both methods are based on the conflict-directed (weighted-degree) heuristic introduced by Boussemart et al. and aim to produce a better-informed version of the heuristic by gathering information through restarting and probing of the search space prior to solving, while minimizing the overhead of these restarts. We further examine the impact of different sampling strategies and different measurements of contention, and assess different restarting strategies for the heuristic. Finally, two applications of constraint weighting are considered in detail: dynamic constraint satisfaction problems and unary resource scheduling problems.
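    The core of the conflict-directed (weighted-degree) heuristic that both methods build on can be sketched briefly; the Constraint scaffolding below is a hypothetical minimum, while the weight update and the dom/wdeg ordering follow Boussemart et al.:

        from collections import namedtuple

        # Hypothetical minimal scaffolding: a constraint knows its variables.
        Constraint = namedtuple("Constraint", ["name", "scope"])

        def on_conflict(constraint, weights):
            """Learning from failure: whenever propagating a constraint wipes
            out a variable's domain, that constraint's weight is incremented."""
            weights[constraint] += 1

        def weighted_degree(var, attached, weights, unassigned):
            """Sum the weights of var's constraints that still involve at
            least one other unassigned variable."""
            return sum(weights[c] for c in attached[var]
                       if any(v != var and v in unassigned for v in c.scope))

        def pick_variable(unassigned, domains, attached, weights):
            """dom/wdeg ordering: choose the variable with the smallest ratio
            of domain size to weighted degree. Restart-based probing, as
            described above, seeds `weights` before the final solving run."""
            return min(unassigned,
                       key=lambda v: len(domains[v])
                       / max(1, weighted_degree(v, attached, weights,
                                                unassigned)))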