    Parallel Genetic Algorithms with GPU Computing

    Genetic algorithms (GAs) are powerful solutions to optimization problems arising in manufacturing and logistics. They help to find better solutions for complex and difficult cases that are hard to solve with strict optimization methods. Accelerating parallel GAs with GPU computing has received significant attention from both practitioners and researchers ever since the emergence of GPU-CPU heterogeneous architectures. Designing a parallel algorithm for a GPU is fundamentally different from designing one for a CPU. On a CPU architecture, data or tasks are typically distributed across tens of threads or processes, while on a GPU architecture hundreds of thousands of threads or more run. To fully utilize the computing power of GPUs, the design approaches and implementation strategies of parallel GAs should be re-examined. This chapter gives a concise overview of parallel GAs on GPUs from the perspective of GPU architecture. The concept of parallelism granularity is redefined, the effect of data layout on kernel performance is discussed, and the thread hierarchy is examined to show how threads are organized in the grid and blocks to expose sufficient parallelism to the GPU. Some directions for future research are discussed. A hybrid parallel model, based on the features of GPU architecture, is suggested for building efficient parallel GAs for hyper-scale problems.
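
    The point about data layout can be made concrete by looking at which array offsets neighboring threads touch. The following is a minimal sketch, not the chapter's method: it assumes a population stored in one flat array and one thread per individual, with all threads reading the same gene index at the same time. The function names and sizes are hypothetical, and plain Python stands in for kernel index arithmetic.

```python
def individual_major_index(individual, gene, num_individuals, num_genes):
    """Flat offset when each individual's genes are stored contiguously."""
    return individual * num_genes + gene


def gene_major_index(individual, gene, num_individuals, num_genes):
    """Flat offset when each gene's values for all individuals are contiguous."""
    return gene * num_individuals + individual


def access_stride(index_fn, gene, num_individuals, num_genes):
    """Distance between the offsets read by two neighboring threads.

    Assumption: thread t handles individual t and reads the same gene.
    """
    first = index_fn(0, gene, num_individuals, num_genes)
    second = index_fn(1, gene, num_individuals, num_genes)
    return second - first


if __name__ == "__main__":
    POP, GENES = 1024, 32  # hypothetical sizes
    print(access_stride(individual_major_index, 0, POP, GENES))
    print(access_stride(gene_major_index, 0, POP, GENES))
```

    A stride of 1 means neighboring threads read adjacent elements, which is the access pattern a GPU can coalesce into fewer memory transactions. A stride equal to the chromosome length spreads the same reads across memory.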

    Mixing multi-core CPUs and GPUs for scientific simulation software

    Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some application paradigms. Software languages and systems such as NVIDIA's CUDA and the Khronos consortium's open compute language (OpenCL) support a number of individual parallel application programming paradigms. To scale up the performance of some complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and very many core GPUs for data parallelism is necessary. We describe our use of hybrid applications using threading approaches and multi-core CPUs to control independent GPU devices. We present speed-up data and discuss multi-threading software issues for the applications level programmer and offer some suggested areas for language development and integration between coarse-grained and fine-grained multi-thread systems. We discuss results from three common simulation algorithmic areas including: partial differential equations; graph cluster metric calculations and random number generation. We report on programming experiences and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs; a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and trends in multi-core programming for scientific applications developers.
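
    The control pattern mentioned in this abstract, CPU threads that each drive an independent GPU, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: `run_on_device` is a hypothetical stand-in that does its work on the CPU, whereas a real version would select a device and launch kernels through CUDA or OpenCL. The names `split_work` and `run_hybrid` are also hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, num_devices):
    """Split items into num_devices contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), num_devices)
    chunks, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def run_on_device(device_id, chunk):
    """Hypothetical stand-in for device work: squares each value on the CPU."""
    return [x * x for x in chunk]


def run_hybrid(items, num_devices):
    """Use one host thread per device and gather results in original order."""
    chunks = split_work(items, num_devices)
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        futures = [pool.submit(run_on_device, d, c)
                   for d, c in enumerate(chunks)]
        results = []
        for future in futures:  # iterate in submission order to keep ordering
            results.extend(future.result())
    return results
```

    The coarse-grained level here is the chunk per host thread; the fine-grained, data-parallel level would live inside whatever replaces `run_on_device`.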

    Implementing genetic algorithms to CUDA environment using data parallelization

    Computation methods for parallel problem solving using graphics processing units (GPUs) have attracted much research interest in recent years. Parallel computation can be applied to genetic algorithms (GAs) in the evaluation of individuals in a population. This paper describes another method of implementing GAs in the CUDA environment, where CUDA is a general-purpose computation environment for GPUs provided by NVIDIA. The distinguishing feature of this study is that parallel processing is applied not only to individuals but also to the genes within an individual. The proposed implementation is evaluated on eight test functions. We found that the proposed implementation runs 7.6 to 18.4 times faster than a CPU implementation.
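
    The two-level decomposition described here, parallel over individuals and over the genes within each individual, can be illustrated with the sphere function f(x) = sum of x_g squared, a common GA benchmark. Whether it is one of the paper's eight test functions is not stated in the abstract. The sketch below is an assumption-laden illustration in plain Python: the outer loop stands in for a block index, the inner loop for a thread index, and all function names are hypothetical.

```python
def gene_term(population, individual, gene):
    """Work of one (individual, gene) 'thread': a single squared gene value."""
    value = population[individual][gene]
    return value * value


def tree_reduce(values):
    """Pairwise sum, mirroring the shape of a per-block parallel reduction."""
    values = list(values)
    while len(values) > 1:
        if len(values) % 2:  # pad odd lengths with the additive identity
            values.append(0.0)
        half = len(values) // 2
        values = [values[i] + values[i + half] for i in range(half)]
    return values[0] if values else 0.0


def evaluate_population(population):
    """Sphere fitness for every individual: per-gene terms, then a reduction."""
    fitness = []
    for individual in range(len(population)):  # stands in for the block index
        terms = [gene_term(population, individual, gene)  # thread index
                 for gene in range(len(population[individual]))]
        fitness.append(tree_reduce(terms))
    return fitness
```

    For example, `evaluate_population([[1.0, 2.0, 3.0]])` computes three independent per-gene terms and then combines them in two reduction steps instead of one sequential pass.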

    Agent-based modelling and Swarm Intelligence in systems engineering

    The aim of this doctoral thesis is to evaluate the usefulness of agent-based modelling, Swarm Intelligence optimization algorithms and parallel programming on graphics cards in the field of systems engineering and automation. A literature review was carried out and a development framework for the agent-based modelling technique was built. This technique was used to model an activated sludge reactor (part of the wastewater treatment process). A complementary notation was developed for describing agent-based models from the systems engineering point of view. An agent-based optimization algorithm following the Swarm Intelligence philosophy was also presented. Graphics-card parallelization techniques were used to reduce the simulation times of models and algorithms. It is therefore a thesis that integrates several technologies.
    Departamento de Ingeniería de Sistemas y Automátic

    A Performance/Cost Model for a CUDA Drug Discovery Application on Physical and Public Cloud Infrastructures

    Virtual Screening (VS) methods can considerably aid drug discovery research, predicting how ligands interact with drug targets. BINDSURF is an efficient and fast blind VS methodology for the determination of protein binding sites, depending on the ligand, using the massively parallel architecture of graphics processing units (GPUs) for fast unbiased prescreening of large ligand databases. In this contribution, we provide a performance/cost model for the execution of this application on both local system and public cloud infrastructures. With our model, it is possible to determine which infrastructure is best in terms of execution time and cost for any given problem to be solved by BINDSURF. Conclusions obtained from our study can be extrapolated to other GPU-based VS methodologies.
    Ingeniería, Industria y Construcció

    HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU

    The end of Dennard scaling and the slowdown of Moore's law led to a shift in technology trends toward parallel architectures, particularly in HPC systems. To continue providing performance benefits, HPC should embrace Approximate Computing (AC), which trades application quality loss for improved performance. However, existing AC techniques have not been extensively applied and evaluated in state-of-the-art hardware architectures such as GPUs, the primary execution vehicle for HPC applications today. This paper presents HPAC-Offload, a pragma-based programming model that extends OpenMP offload applications to support AC techniques, allowing portable approximations across different GPU architectures. We conduct a comprehensive performance analysis of HPAC-Offload across GPU-accelerated HPC applications, revealing that AC techniques can significantly accelerate HPC applications (LULESH: 1.64x on AMD, 1.57x on NVIDIA) with minimal quality loss (0.1%). Our analysis offers deep insights into the performance of GPU-based AC that guide the future development of AC algorithms and systems for these architectures.
    Comment: 12 pages, 12 pages. Accepted at SC2
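
    The trade-off that Approximate Computing makes can be shown with loop perforation, a well-known AC technique that skips some loop iterations and rescales the result. The sketch below is only an illustration of that general idea: HPAC-Offload itself is a pragma-based extension of OpenMP offload, and nothing here reproduces its syntax or the configurations evaluated in the paper. All function names are hypothetical.

```python
def exact_sum(f, n):
    """Reference result: evaluate f on every iteration."""
    return sum(f(i) for i in range(n))


def perforated_sum(f, n, stride):
    """Evaluate only every stride-th iteration, then rescale the partial sum."""
    executed = len(range(0, n, stride))
    if executed == 0:
        return 0.0
    partial = sum(f(i) for i in range(0, n, stride))
    return partial * n / executed


def relative_error(approx, exact):
    """Quality loss of the approximation relative to the reference."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    N = 10_000  # hypothetical problem size
    smooth = lambda i: (i / N) ** 2
    reference = exact_sum(smooth, N)
    approx = perforated_sum(smooth, N, stride=2)  # half the iterations
    print(relative_error(approx, reference))
```

    With a stride of 2 the loop does half the work. How much quality is lost depends on how smoothly the skipped values vary, which is why such techniques need per-application evaluation.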

    Natural ventilation design attributes application effect on, indoor natural ventilation performance of a double storey, single unit residential building

    In establishing good indoor thermal conditions, air movement is one of the important parameters to consider in providing fresh indoor air for occupants. Owing to public awareness of environmental impact, people have become increasingly attentive to passive design as a way to achieve good indoor ventilation. Across the case studies, several building attributes were found to affect indoor natural ventilation performance. The studies were grouped into vernacular houses, contemporary houses with vernacular elements, and contemporary houses. The indoor air movement in each space of the houses was compared with the outdoor air movement around the houses to indicate each space's indoor natural ventilation performance. The analysis found that the wind catcher is the attribute that contributes most to indoor natural ventilation. Wide openings were also found to be significant, especially those with louvers. Indoor layout design was also found to have a significant impact on performance. The findings indicate that good indoor natural ventilation depends not only on having proper openings at proper locations in a building, but also on how the incoming air movement is managed throughout the interior spaces by a proper layout. Understanding the air pressure distribution caused by the indoor windward and leeward sides is important in directing the air flow to the desired spaces and producing good overall indoor natural ventilation performance.

    Integrative multicellular biological modeling: a case study of 3D epidermal development using GPU algorithms

    Background: Simulation of sophisticated biological models requires considerable computational power. These models typically integrate numerous biological phenomena such as spatially-explicit heterogeneous cells, cell-cell interactions, cell-environment interactions and intracellular gene networks. The recent advent of programming for graphical processing units (GPU) opens up the possibility of developing more integrative, detailed and predictive biological models while at the same time decreasing the computational cost to simulate those models.
    Results: We construct a 3D model of epidermal development and provide a set of GPU algorithms that executes significantly faster than sequential central processing unit (CPU) code. We provide a parallel implementation of the subcellular element method for individual cells residing in a lattice-free spatial environment. Each cell in our epidermal model includes an internal gene network, which integrates cellular interaction of Notch signaling together with environmental interaction of basement membrane adhesion, to specify cellular state and behaviors such as growth and division. We take a pedagogical approach to describing how modeling methods are efficiently implemented on the GPU including memory layout of data structures and functional decomposition. We discuss various programmatic issues and provide a set of design guidelines for GPU programming that are instructive to avoid common pitfalls as well as to extract performance from the GPU architecture.
    Conclusions: We demonstrate that GPU algorithms represent a significant technological advance for the simulation of complex biological models. We further demonstrate with our epidermal model that the integration of multiple complex modeling methods for heterogeneous multicellular biological processes is both feasible and computationally tractable using this new technology. We hope that the provided algorithms and source code will be a starting point for modelers to develop their own GPU implementations, and encourage others to implement their modeling methods on the GPU and to make that code available to the wider community.