514 research outputs found

    Green Parallel Metaheuristics: Design, Implementation, and Evaluation

    Get PDF
    Fecha de lectura de Tesis Doctoral 14 mayo 2020Green parallel metaheuristics (GPM) is a new concept we want to introduce in this thesis. It is an idea inspired by two facts: (i) parallel metaheuristics could help as unique tools to solve optimization problems in energy savings applications and sustainability, and (ii) these algorithms themselves run on multiprocessors, clusters, and grids of computers and then consume energy, so they need an energy analysis study for their different implementations over multiprocessors. The context for this thesis is to make a modern and competitive effort to extend the capability of present intelligent search optimization techniques. Analyzing the different sequential and parallel metaheuristics considering its energy consumption requires a deep investigation of the numerical performance, the execution time for efficient future designing to these algorithms. We present a study of the speed-up of the different parallel implementations over a different number of computing units. Moreover, we analyze and compare the energy consumption and numerical performance of the sequential/parallel algorithms and their components: a jump in the efficiency of the algorithms that would probably have a wide impact on the domains involved.El Instituto Egipcio en Madrid, dependiente del Gobierno de Egipto

    A GPU-based Evolution Strategy for Optic Disk Detection in Retinal Images

    Get PDF
    La ejecución paralela de aplicaciones usando unidades de procesamiento gráfico (gpu) ha ganado gran interés en la comunidad académica en los años recientes. La computación paralela puede ser aplicada a las estrategias evolutivas para procesar individuos dentro de una población, sin embargo, las estrategias evolutivas se caracterizan por un significativo consumo de recursos computacionales al resolver problemas de gran tamaño o aquellos que se modelan mediante funciones de aptitud complejas. Este artículo describe la implementación de una estrategia evolutiva para la detección del disco óptico en imágenes de retina usando Compute Unified Device Architecture (cuda). Los resultados experimentales muestran que el tiempo de ejecución para la detección del disco óptico logra una aceleración de 5 a 7 veces, comparado con la ejecución secuencial en una cpu convencional.Parallel processing using graphic processing units (GPUs) has attracted much research interest in recent years. Parallel computation can be applied to evolution strategy (ES) for processing individuals in a population, but evolutionary strategies are time consuming to solve large computational problems or complex fitness functions. In this paper we describe the implementation of an improved ES for optic disk detection in retinal images using the Compute Unified Device Architecture (CUDA) environment. In the experimental results we show that the computational time for optic disk detection task has a speedup factor of 5x and 7x compared to an implementation on a mainstream CPU

    Parallel genetic algorithms: a feasible distributed : Implementation

    Get PDF
    Parallel genetic algorithms, models and implementations, attempts to exploit the intrinsically parallel nature of genetic algorithms. By distributing the total population, these models ref1ects a bebaviour nearer to that of natural systems. A variety of parallel computer systems architectures can offer distinct support features for their implementation. Ibis paper shows sorne remarkable characteristics of parallel genetic algorithms, details of a feasible design and their implementation. A1so some results related to the island model are shown.Eje: Redes Neuronales. Algoritmos genéticosRed de Universidades con Carreras en Informática (RedUNCI

    Parallel genetic algorithms: a feasible distributed : Implementation

    Get PDF
    Parallel genetic algorithms, models and implementations, attempts to exploit the intrinsically parallel nature of genetic algorithms. By distributing the total population, these models ref1ects a bebaviour nearer to that of natural systems. A variety of parallel computer systems architectures can offer distinct support features for their implementation. Ibis paper shows sorne remarkable characteristics of parallel genetic algorithms, details of a feasible design and their implementation. A1so some results related to the island model are shown.Eje: Redes Neuronales. Algoritmos genéticosRed de Universidades con Carreras en Informática (RedUNCI

    Relaxing Synchronization in Distributed Simulated Annealing

    Get PDF
    Simulated annealing is an attractive, but expensive, heuristic for approximating the solution to combinatorial optimization problems. Since simulated annealing is a general purpose method, it can be applied to the broad range of NP-complete problems such as the traveling salesman problem, graph theory, and cell placement with a careful control of the cooling schedule. Attempts to parallelize simulated annealing, particularly on distributed memory multicomputers, are hampered by the algorithm’s requirement of a globally consistent system state. In a multicomputer, maintaining the global state S involves explicit message traffic and is a critical performance bottleneck. One way to mitigate this bottleneck is to amortize the overhead of these state updates over as many parallel state changes as possible. By using this technique, errors in the actual cost C(S) of a particular state S will be introduced into the annealing process. This dissertation places analytically derived bounds on the cost error in order to assure convergence to the correct result. The resulting parallel Simulated Annealing algorithm dynamically changes the frequency of global updates as a function of the annealing control parameter, i.e. temperature. Implementation results on an Intel iPSC/2 are reported

    High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

    Full text link
    We implement a master-slave parallel genetic algorithm (PGA) with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement a PGA and visualise the results using disjoint minimal spanning trees (MSTs). We demonstrate that our GPU PGA, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable due to compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.Comment: 10 pages, 5 figures, 4 tables, More thorough discussion of implementatio

    Cluster Computing in the Classroom: Topics, Guidelines, and Experiences

    Get PDF
    With the progress of research on cluster computing, more and more universities have begun to offer various courses covering cluster computing. A wide variety of content can be taught in these courses. Because of this, a difficulty that arises is the selection of appropriate course material. The selection is complicated by the fact that some content in cluster computing is also covered by other courses such as operating systems, networking, or computer architecture. In addition, the background of students enrolled in cluster computing courses varies. These aspects of cluster computing make the development of good course material difficult. Combining our experiences in teaching cluster computing in several universities in the USA and Australia and conducting tutorials at many international conferences all over the world, we present prospective topics in cluster computing along with a wide variety of information sources (books, software, and materials on the web) from which instructors can choose. The course material described includes system architecture, parallel programming, algorithms, and applications. Instructors are advised to choose selected units in each of the topical areas and develop their own syllabus to meet course objectives. For example, a full course can be taught on system architecture for core computer science students. Or, a course on parallel programming could contain a brief coverage of system architecture and then devote the majority of time to programming methods. Other combinations are also possible. We share our experiences in teaching cluster computing and the topics we have chosen depending on course objectives

    Computing and communications for the software-defined metamaterial paradigm: a context analysis

    Get PDF
    Metamaterials are artificial structures that have recently enabled the realization of novel electromagnetic components with engineered and even unnatural functionalities. Existing metamaterials are specifically designed for a single application working under preset conditions (e.g., electromagnetic cloaking for a fixed angle of incidence) and cannot be reused. Software-defined metamaterials (SDMs) are a much sought-after paradigm shift, exhibiting electromagnetic properties that can be reconfigured at runtime using a set of software primitives. To enable this new technology, SDMs require the integration of a network of controllers within the structure of the metamaterial, where each controller interacts locally and communicates globally to obtain the programmed behavior. The design approach for such controllers and the interconnection network, however, remains unclear due to the unique combination of constraints and requirements of the scenario. To bridge this gap, this paper aims to provide a context analysis from the computation and communication perspectives. Then, analogies are drawn between the SDM scenario and other applications both at the micro and nano scales, identifying possible candidates for the implementation of the controllers and the intra-SDM network. Finally, the main challenges of SDMs related to computing and communications are outlined.Peer ReviewedPostprint (published version

    Parallel genetic algorithms: a feasible distributed : Implementation

    Get PDF
    Parallel genetic algorithms, models and implementations, attempts to exploit the intrinsically parallel nature of genetic algorithms. By distributing the total population, these models ref1ects a bebaviour nearer to that of natural systems. A variety of parallel computer systems architectures can offer distinct support features for their implementation. Ibis paper shows sorne remarkable characteristics of parallel genetic algorithms, details of a feasible design and their implementation. A1so some results related to the island model are shown.Eje: Redes Neuronales. Algoritmos genéticosRed de Universidades con Carreras en Informática (RedUNCI

    Parallel Computers and Complex Systems

    Get PDF
    We present an overview of the state of the art and future trends in high performance parallel and distributed computing, and discuss techniques for using such computers in the simulation of complex problems in computational science. The use of high performance parallel computers can help improve our understanding of complex systems, and the converse is also true --- we can apply techniques used for the study of complex systems to improve our understanding of parallel computing. We consider parallel computing as the mapping of one complex system --- typically a model of the world --- into another complex system --- the parallel computer. We study static, dynamic, spatial and temporal properties of both the complex systems and the map between them. The result is a better understanding of which computer architectures are good for which problems, and of software structure, automatic partitioning of data, and the performance of parallel machines
    corecore