
    Parallel Genetic Algorithms with GPU Computing

    Genetic algorithms (GAs) are powerful solutions to optimization problems arising from the manufacturing and logistics fields. They help to find better solutions for complex and difficult cases that are hard to solve with exact optimization methods. Accelerating parallel GAs with GPU computing has received significant attention from both practitioners and researchers ever since the emergence of GPU-CPU heterogeneous architectures. Designing a parallel algorithm on a GPU is fundamentally different from designing one on a CPU. On a CPU architecture, data or tasks are typically distributed across tens of threads or processes, while on a GPU architecture hundreds of thousands of threads run concurrently. To fully utilize the computing power of GPUs, the design approaches and implementation strategies of parallel GAs must be re-examined. In this chapter, a concise overview of parallel GAs on the GPU is given from the perspective of GPU architecture. The concept of parallelism granularity is redefined, the effect of data layout on kernel performance is discussed, and the thread hierarchy is examined to show how threads are organized into grids and blocks to expose sufficient parallelism to the GPU. Some directions for future research are discussed, and a hybrid parallel model based on the features of the GPU architecture is suggested for building efficient parallel GAs for hyper-scale problems.
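
    As a rough illustration of how parallelism granularity, data layout, and the thread hierarchy interact, the sketch below evaluates the population at coarse granularity, one thread per individual, with the genes stored in a structure-of-arrays layout so that neighbouring threads read neighbouring memory. The kernel name, the toy sphere objective, and the launch configuration are assumptions made for this example and do not come from the chapter.

```cuda
// Coarse-grained sketch: one GPU thread evaluates one individual.
// Genes are stored gene-major (structure of arrays) so that consecutive
// threads access consecutive addresses, i.e. reads are coalesced.
#include <cuda_runtime.h>

__global__ void evaluate_population(const float* genes,   // [numGenes * popSize], gene-major
                                    float* fitness,        // [popSize]
                                    int popSize, int numGenes)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;       // global thread id = individual id
    if (idx >= popSize) return;

    float sum = 0.0f;
    for (int g = 0; g < numGenes; ++g) {
        float x = genes[g * popSize + idx];                // gene g of all individuals is contiguous
        sum += x * x;                                      // toy "sphere" objective
    }
    fitness[idx] = sum;
}

int main()
{
    const int popSize = 1 << 16, numGenes = 32;
    float *dGenes, *dFitness;
    cudaMalloc(&dGenes,   popSize * numGenes * sizeof(float));
    cudaMalloc(&dFitness, popSize * sizeof(float));
    // (gene initialization omitted in this sketch)

    // Expose enough parallelism: many blocks of 256 threads each.
    int threads = 256;
    int blocks  = (popSize + threads - 1) / threads;
    evaluate_population<<<blocks, threads>>>(dGenes, dFitness, popSize, numGenes);
    cudaDeviceSynchronize();

    cudaFree(dGenes);
    cudaFree(dFitness);
    return 0;
}
```

    At finer granularity, a whole thread block could instead be devoted to one individual, with its threads cooperating on a single fitness evaluation; which choice exposes enough parallelism depends on how much work each evaluation contains.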

    A GPU-Computing Approach to Solar Stokes Profile Inversion

    We present a new computational approach to the inversion of solar photospheric Stokes polarization profiles, under the Milne-Eddington model, for vector magnetography. Our code, named GENESIS (GENEtic Stokes Inversion Strategy), employs multi-threaded parallel-processing techniques to harness the computing power of graphics processing units (GPUs), along with algorithms designed to exploit the inherent parallelism of the Stokes inversion problem. Using a genetic algorithm (GA) engineered specifically for use with a GPU, we produce full-disc maps of the photospheric vector magnetic field from polarized spectral line observations recorded by the Synoptic Optical Long-term Investigations of the Sun (SOLIS) Vector Spectromagnetograph (VSM) instrument. We show the advantages of pairing a population-parallel genetic algorithm with data-parallel GPU-computing techniques, and present an overview of the Stokes inversion problem, including a description of our adaptation to the GPU-computing paradigm. Full-disc vector magnetograms derived by this method are shown, using SOLIS/VSM data observed on 2008 March 28 at 15:45 UT.
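
    To make the data-parallel fitness step concrete, the hedged sketch below scores each candidate parameter set in the GA population by its chi-squared misfit against the observed Stokes I, Q, U, V profiles of one pixel. The synthesize_stokes stub stands in for the Milne-Eddington forward model, and all names and array layouts are assumptions for illustration rather than GENESIS code.

```cuda
// Placeholder for the Milne-Eddington forward synthesis of Stokes component s
// at wavelength index l for model parameters p (the real model is omitted here,
// the stub only keeps the sketch self-contained).
__device__ float synthesize_stokes(const float* p, int s, int l)
{
    return 0.0f;
}

// One thread per candidate model: data-parallel chi-squared fitness evaluation.
__global__ void stokes_chi2(const float* candidates,  // [popSize * nParams]
                            const float* observed,    // [4 * nLambda], Stokes I, Q, U, V spectra
                            float* chi2,
                            int popSize, int nParams, int nLambda)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= popSize) return;

    const float* p = &candidates[idx * nParams];
    float sum = 0.0f;
    for (int s = 0; s < 4; ++s)                        // loop over Stokes components
        for (int l = 0; l < nLambda; ++l) {
            float diff = synthesize_stokes(p, s, l) - observed[s * nLambda + l];
            sum += diff * diff;
        }
    chi2[idx] = sum;                                   // lower misfit = fitter individual
}
```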

    Accelerating Scientific Computing Models Using GPU Processing

    GPGPUs offer significant computational power for programmers to leverage. This computational power is especially useful when used to accelerate scientific models. This thesis analyzes the use of GPGPU programming to accelerate scientific computing models. First, the construction of hardware for visualization and computation of scientific models is discussed, focusing on the factors that affect performance for scientific modeling. Image processing is an embarrassingly parallel problem well suited to GPGPU acceleration. An image processing library was developed to illustrate the process of recognizing embarrassingly parallel problems, and it serves as an example of converting a serial CPU implementation into a GPU-accelerated one. Genetic algorithms are biologically inspired heuristic search algorithms based on natural selection. The Tetris genetic algorithm with A* pathfinding illustrates memory-bound limitations that can prevent direct algorithm conversions from the CPU to the GPU. An analysis of an existing landscape evolution model, CHILD, shows that even when a model looks promising for GPU acceleration, the underlying data structures can significantly affect the ability to move to a GPU implementation. CHILD also offers an example of integrating an existing model more tightly with MATLAB. Lastly, a parallel spatial sorting algorithm is discussed as a possible replacement for the spatial sorting algorithms currently implemented in models such as smoothed particle hydrodynamics.
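
    The grayscale-conversion kernel below is a minimal sketch of the embarrassingly parallel pattern described above: each output pixel depends only on its own input pixel, so a serial double loop over the image maps directly onto a 2D grid of GPU threads. The kernel name and buffer layout are illustrative assumptions, not code from the thesis' image-processing library.

```cuda
// One thread per pixel: interleaved RGB (3 bytes per pixel) to grayscale.
__global__ void rgb_to_gray(const unsigned char* rgb, unsigned char* gray,
                            int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;             // guard against partial blocks

    int i = y * width + x;
    float r = rgb[3 * i + 0];
    float g = rgb[3 * i + 1];
    float b = rgb[3 * i + 2];
    gray[i] = (unsigned char)(0.299f * r + 0.587f * g + 0.114f * b);  // Rec. 601 luma weights
}

// Typical launch: a 2D grid of 16x16-thread blocks covering the whole image.
//   dim3 block(16, 16);
//   dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
//   rgb_to_gray<<<grid, block>>>(dRgb, dGray, width, height);
```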

    Genetic Algorithm Modeling with GPU Parallel Computing Technology

    We present a multi-purpose genetic algorithm, designed and implemented with GPGPU/CUDA parallel computing technology. The model was derived from a multi-core CPU serial implementation, named GAME, which has already been scientifically tested and validated on astrophysical massive data classification problems through a web application resource (DAMEWARE) specialized in data mining based on Machine Learning paradigms. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm makes it possible to exploit the internal training features of the model, permitting a strong optimization in terms of processing performance and scalability. Comment: 11 pages, 2 figures, refereed proceedings; Neural Nets and Surroundings, Proceedings of 22nd Italian Workshop on Neural Nets, WIRN 2012; Smart Innovation, Systems and Technologies, Vol. 19, Springer

    High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

    We implement a master-slave parallel genetic algorithm (PGA) with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement the PGA and visualise the results using disjoint minimal spanning trees (MSTs). We demonstrate that our GPU PGA, implemented on a commercially available general-purpose GPU, is able to recover stock clusters at sub-second speed, based on a subset of stocks in the South African market. This represents a pragmatic choice for low-cost, scalable parallel computing, and it is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable because of compiler differences. Combined with fast online intraday correlation matrix estimation from high-frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners. Comment: 10 pages, 5 figures, 4 tables; more thorough discussion of implementation
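
    As a hedged sketch of the master-slave structure, assuming generic names rather than the authors' code, the host loop below plays the master, applying the genetic operators between generations, while a kernel launch farms the expensive fitness evaluations (the bespoke log-likelihood in the paper) out to one GPU thread per individual.

```cuda
#include <cuda_runtime.h>

// Slave side: each GPU thread evaluates the fitness of one individual.
// The objective body is only a placeholder for the paper's log-likelihood.
__global__ void evaluate_fitness(const float* population, float* fitness,
                                 int popSize, int chromLen)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= popSize) return;

    float score = 0.0f;
    for (int g = 0; g < chromLen; ++g)
        score += population[idx * chromLen + g];        // placeholder objective
    fitness[idx] = score;
}

// Master side: the CPU orchestrates generations and runs the genetic operators.
void run_master_slave_ga(float* dPop, float* dFit,
                         int popSize, int chromLen, int generations)
{
    int threads = 256;
    int blocks  = (popSize + threads - 1) / threads;
    for (int gen = 0; gen < generations; ++gen) {
        evaluate_fitness<<<blocks, threads>>>(dPop, dFit, popSize, chromLen);
        cudaDeviceSynchronize();                        // wait for the slaves

        // Selection, crossover and mutation would run here on the host
        // (hypothetical step; a fully GPU-resident GA would move these into kernels too).
    }
}
```

    The appeal of the master-slave layout is that the algorithm's logic stays unchanged on the CPU and only the evaluation step is offloaded, which is why it is often the first parallelization applied to a GA.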

    Solving the Uncapacitated Single Allocation p-Hub Median Problem on GPU

    A parallel genetic algorithm (GA) implemented on GPU clusters is proposed to solve the Uncapacitated Single Allocation p-Hub Median problem. The GA uses binary and integer encoding and genetic operators adapted to this problem. Our GA is improved by generating initial solutions with hubs located at middle nodes. The experimental results obtained are compared with the best known solutions on all benchmark instances with up to 1000 nodes. Furthermore, we solve our own randomly generated instances with up to 6000 nodes. Our approach outperforms most well-known heuristics in terms of solution quality and execution time, and it allows hitherto unsolved problems to be solved.
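
    To make the encoding concrete, the sketch below (host-side C++ with assumed names, not the authors' code) pairs a binary vector marking the p hub nodes with an integer vector allocating every node to a single hub, and seeds the hubs at "middle" nodes, taken here to mean the nodes with the smallest total distance to all others, as a stand-in for the paper's initial-solution heuristic.

```cuda
#include <vector>
#include <numeric>
#include <algorithm>

// Two-part chromosome: a binary hub mask plus a single-allocation vector.
struct PHubChromosome {
    std::vector<char> isHub;       // isHub[i] == 1 iff node i is one of the p hubs
    std::vector<int>  allocation;  // allocation[i] = index of the hub serving node i
};

// Build an initial solution with hubs at the p most "central" (middle) nodes.
PHubChromosome init_middle_nodes(const std::vector<std::vector<double>>& dist, int p)
{
    int n = static_cast<int>(dist.size());

    // Total distance from each node to all others; smaller means more central.
    std::vector<double> total(n);
    for (int i = 0; i < n; ++i)
        total[i] = std::accumulate(dist[i].begin(), dist[i].end(), 0.0);

    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return total[a] < total[b]; });

    PHubChromosome c{std::vector<char>(n, 0), std::vector<int>(n, -1)};
    for (int k = 0; k < p; ++k)
        c.isHub[order[k]] = 1;                          // p most central nodes become hubs

    // Single allocation: assign every node to its nearest hub.
    for (int i = 0; i < n; ++i) {
        int best = order[0];
        for (int k = 0; k < p; ++k)
            if (dist[i][order[k]] < dist[i][best]) best = order[k];
        c.allocation[i] = best;
    }
    return c;
}
```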

    TensorFlow Enabled Genetic Programming

    Genetic Programming, a kind of evolutionary computation and machine learning algorithm, is shown to benefit significantly from the application of vectorized data and the TensorFlow numerical computation library on both CPU and GPU architectures. The open-source Python package Karoo GP is employed for a series of 190 tests across 6 platforms, with real-world datasets ranging from 18 to 5.5M data points. This body of tests demonstrates that datasets measured in tens and hundreds of data points see a 2-15x improvement when moving from the scalar/SymPy configuration to the vector/TensorFlow configuration, with a single core performing on par with or better than multiple CPU cores and GPUs. A dataset composed of 90,000 data points demonstrates a single vector/TensorFlow CPU core performing 875x better than 40 scalar/SymPy CPU cores. And a dataset containing 5.5M data points sees GPU configurations outperforming CPU configurations by 1.3x on average. Comment: 8 pages, 5 figures; presented at GECCO 2017, Berlin, Germany

    A GPU-based Evolution Strategy for Optic Disk Detection in Retinal Images

    Parallel processing using graphics processing units (GPUs) has attracted much research interest in recent years. Parallel computation can be applied to evolution strategies (ES) to process the individuals of a population, but evolution strategies are time-consuming when solving large computational problems or evaluating complex fitness functions. In this paper we describe the implementation of an improved ES for optic disk detection in retinal images using the Compute Unified Device Architecture (CUDA) environment. The experimental results show that the computational time for the optic disk detection task achieves a speedup factor of 5x to 7x compared to an implementation on a mainstream CPU.
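
    A minimal sketch of the per-individual parallelism an evolution strategy exposes is given below: each GPU thread perturbs one candidate optic-disk hypothesis (centre x, centre y, radius) with Gaussian mutation drawn from cuRAND. The kernel, parameter names, and the three-parameter encoding are assumptions for illustration; the image-based fitness evaluation that scores each candidate circle is omitted.

```cuda
#include <curand_kernel.h>

// One thread per candidate: Gaussian mutation of an (x, y, radius) optic-disk hypothesis.
// The curandState array must have been seeded beforehand with curand_init.
__global__ void mutate_candidates(float3* candidates, curandState* states,
                                  float sigma, int popSize)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= popSize) return;

    curandState local = states[idx];                     // per-thread RNG state
    candidates[idx].x += sigma * curand_normal(&local);  // centre x (pixels)
    candidates[idx].y += sigma * curand_normal(&local);  // centre y (pixels)
    candidates[idx].z += sigma * curand_normal(&local);  // radius (pixels)
    states[idx] = local;                                 // store updated RNG state
}
```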