500 research outputs found

    Brook Auto: High-Level Certification-Friendly Programming for GPU-powered Automotive Systems

    Get PDF
    Modern automotive systems require increased performance to implement Advanced Driving Assistance Systems (ADAS). GPU-powered platforms are promising candidates for such computational tasks, however current low-level programming models challenge the accelerator software certification process, while they limit the hardware selection to a fraction of the available platforms. In this paper we present Brook Auto, a high-level programming language for automotive GPU systems which removes these limitations. We describe the challenges and solutions we faced in its implementation, as well as a complete evaluation in terms of performance and productivity, which shows the effectiveness of our method.This work has been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence.Peer ReviewedPostprint (author's final draft

    Mixing multi-core CPUs and GPUs for scientific simulation software

    Get PDF
    Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some applications paradigms. Software languages and systems such as NVIDIA's CUDA and Khronos consortium's open compute language (OpenCL) support a number of individual parallel application programming paradigms. To scale up the performance of some complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and very many core GPUs for data parallelism is necessary. We describe our use of hybrid applica- tions using threading approaches and multi-core CPUs to control independent GPU devices. We present speed-up data and discuss multi-threading software issues for the applications level programmer and o er some suggested areas for language development and integration between coarse-grained and ne-grained multi-thread systems. We discuss results from three common simulation algorithmic areas including: partial di erential equations; graph cluster metric calculations and random number generation. We report on programming experiences and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs; a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and trends in multi-core programming for scienti c applications developers

    Real-Time GPS-Alternative Navigation Using Commodity Hardware

    Get PDF
    Modern navigation systems can use the Global Positioning System (GPS) to accurately determine position with precision in some cases bordering on millimeters. Unfortunately, GPS technology is susceptible to jamming, interception, and unavailability indoors or underground. There are several navigation techniques that can be used to navigate during times of GPS unavailability, but there are very few that result in GPS-level precision. One method of achieving high precision navigation without GPS is to fuse data obtained from multiple sensors. This thesis explores the fusion of imaging and inertial sensors and implements them in a real-time system that mimics human navigation. In addition, programmable graphics processing unit technology is leveraged to perform stream-based image processing using a computer\u27s video card. The resulting system can perform complex mathematical computations in a fraction of the time those same operations would take on a CPU-based platform. The resulting system is an adaptable, portable, inexpensive and self-contained software and hardware platform, which paves the way for advances in autonomous navigation, mobile cartography, and artificial intelligence

    Development of a GPGPU accelerated tool to simulate advection-reaction-diffusion phenomena in 2D

    Get PDF
    Computational models are powerful tools to the study of environmental systems, playing a fundamental role in several fields of research (hydrological sciences, biomathematics, atmospheric sciences, geosciences, among others). Most of these models require high computational capacity, especially when one considers high spatial resolution and the application to large areas. In this context, the exponential increase in computational power brought by General Purpose Graphics Processing Units (GPGPU) has drawn the attention of scientists and engineers to the development of low cost and high performance parallel implementations of environmental models. In this research, we apply GPGPU computing for the development of a model that describes the physical processes of advection, reaction and diffusion. This presentation is held in the form of three self-contained articles. In the first one, we present a GPGPU implementation for the solution of the 2D groundwater flow equation in unconfined aquifers for heterogenous and anisotropic media. We implement a finite difference solution scheme based on the Crank- Nicolson method and show that the GPGPU accelerated solution implemented using CUDA C/C++ (Compute Unified Device Architecture) greatly outperforms the corresponding serial solution implemented in C/C++. The results show that accelerated GPGPU implementation is capable of delivering up to 56 times acceleration in the solution process using an ordinary office computer. In the second article, we study the application of a diffusive-logistic growth (DLG) model to the problem of forest growth and regeneration. The study focuses on vegetation belonging to preservation areas, such as riparian buffer zones. The study was developed in two stages: (i) a methodology based on Artificial Neural Network Ensembles (ANNE) was applied to evaluate the width of riparian buffer required to filter 90% of the residual nitrogen; (ii) the DLG model was calibrated and validated to generate a prognostic of forest regeneration in riparian protection bands considering the minimum widths indicated by the ANNE. The solution was implemented in GPGPU and it was applied to simulate the forest regeneration process for forty years on the riparian protection bands along the Ligeiro river, in Brazil. The results from calibration and validation showed that the DLG model provides fairly accurate results for the modelling of forest regeneration. In the third manuscript, we present a GPGPU implementation of the solution of the advection-reaction-diffusion equation in 2D. The implementation is designed to be general and flexible to allow the modeling of a wide range of processes, including those with heterogeneity and anisotropy. We show that simulations performed in GPGPU allow the use of mesh grids containing more than 20 million points, corresponding to an area of 18,000 km? in a standard Landsat image resolution.Os modelos computacionais s?o ferramentas poderosas para o estudo de sistemas ambientais, desempenhando um papel fundamental em v?rios campos de pesquisa (ci?ncias hidrol?gicas, biomatem?tica, ci?ncias atmosf?ricas, geoci?ncias, entre outros). A maioria desses modelos requer alta capacidade computacional, especialmente quando se considera uma alta resolu??o espacial e a aplica??o em grandes ?reas. Neste contexto, o aumento exponencial do poder computacional trazido pelas Unidades de Processamento de Gr?ficos de Prop?sito Geral (GPGPU) chamou a aten??o de cientistas e engenheiros para o desenvolvimento de implementa??es paralelas de baixo custo e alto desempenho para modelos ambientais. Neste trabalho, aplicamos computa??o em GPGPU para o desenvolvimento de um modelo que descreve os processos f?sicos de advec??o, rea??o e difus?o. Esta disserta??o ? apresentada sob a forma de tr?s artigos. No primeiro, apresentamos uma implementa??o em GPGPU para a solu??o da equa??o de fluxo de ?guas subterr?neas 2D em aqu?feros n?o confinados para meios heterog?neos e anisotr?picos. Foi implementado um esquema de solu??o de diferen?as finitas com base no m?todo Crank- Nicolson e mostramos que a solu??o acelerada GPGPU implementada usando CUDA C / C ++ supera a solu??o serial correspondente implementada em C / C ++. Os resultados mostram que a implementa??o acelerada por GPGPU ? capaz de fornecer acelera??o de at? 56 vezes no processo da solu??o usando um computador de escrit?rio comum. No segundo artigo estudamos a aplica??o de um modelo de crescimento log?stico difusivo (DLG) ao problema de crescimento e regenera??o florestal. O estudo foi desenvolvido em duas etapas: (i) Aplicou-se uma metodologia baseada em Comites de Rede Neural Artificial (ANNE) para avaliar a largura da faixa de prote??o rip?ria necess?ria para filtrar 90% do nitrog?nio residual; (ii) O modelo DLG foi calibrado e validado para gerar um progn?stico de regenera??o florestal em faixas de prote??o rip?rias considerando as larguras m?nimas indicadas pela ANNE. A solu??o foi implementada em GPGPU e aplicada para simular o processo de regenera??o florestal para um per?odo de quarenta anos na faixa de prote??o rip?ria ao longo do rio Ligeiro, no Brasil. Os resultados da calibra??o e valida??o mostraram que o modelo DLG fornece resultados bastante precisos para a modelagem de regenera??o florestal. No terceiro artigo, apresenta-se uma implementa??o em GPGPU para solu??o da equa??o advec??o-rea??o-difus?o em 2D. A implementa??o ? projetada para ser geral e flex?vel para permitir a modelagem de uma ampla gama de processos, incluindo caracter?sticas como heterogeneidade e anisotropia do meio. Neste trabalho mostra-se que as simula??es realizadas em GPGPU permitem o uso de malhas contendo mais de 20 milh?es de pontos (vari?veis), correspondendo a uma ?rea de 18.000 km? em resolu??o de 30m padr?o das imagens Landsat

    PC-grade parallel processing and hardware acceleration for large-scale data analysis

    Get PDF
    Arguably, modern graphics processing units (GPU) are the first commodity, and desktop parallel processor. Although GPU programming was originated from the interactive rendering in graphical applications such as computer games, researchers in the field of general purpose computation on GPU (GPGPU) are showing that the power, ubiquity and low cost of GPUs makes them an ideal alternative platform for high-performance computing. This has resulted in the extensive exploration in using the GPU to accelerate general-purpose computations in many engineering and mathematical domains outside of graphics. However, limited to the development complexity caused by the graphics-oriented concepts and development tools for GPU-programming, GPGPU has mainly been discussed in the academic domain so far and has not yet fully fulfilled its promises in the real world. This thesis aims at exploiting GPGPU in the practical engineering domain and presented a novel contribution to GPGPU-driven linear time invariant (LTI) systems that are employed by the signal processing techniques in stylus-based or optical-based surface metrology and data processing. The core contributions that have been achieved in this project can be summarized as follow. Firstly, a thorough survey of the state-of-the-art of GPGPU applications and their development approaches has been carried out in this thesis. In addition, the category of parallel architecture pattern that the GPGPU belongs to has been specified, which formed the foundation of the GPGPU programming framework design in the thesis. Following this specification, a GPGPU programming framework is deduced as a general guideline to the various GPGPU programming models that are applied to a large diversity of algorithms in scientific computing and engineering applications. Considering the evolution of GPU’s hardware architecture, the proposed frameworks cover through the transition of graphics-originated concepts for GPGPU programming based on legacy GPUs and the abstraction of stream processing pattern represented by the compute unified device architecture (CUDA) in which GPU is considered as not only a graphics device but a streaming coprocessor of CPU. Secondly, the proposed GPGPU programming framework are applied to the practical engineering applications, namely, the surface metrological data processing and image processing, to generate the programming models that aim to carry out parallel computing for the corresponding algorithms. The acceleration performance of these models are evaluated in terms of the speed-up factor and the data accuracy, which enabled the generation of quantifiable benchmarks for evaluating consumer-grade parallel processors. It shows that the GPGPU applications outperform the CPU solutions by up to 20 times without significant loss of data accuracy and any noticeable increase in source code complexity, which further validates the effectiveness of the proposed GPGPU general programming framework. Thirdly, this thesis devised methods for carrying out result visualization directly on GPU by storing processed data in local GPU memory through making use of GPU’s rendering device features to achieve realtime interactions. The algorithms employed in this thesis included various filtering techniques, discrete wavelet transform, and the fast Fourier Transform which cover the common operations implemented in most LTI systems in spatial and frequency domains. Considering the employed GPUs’ hardware designs, especially the structure of the rendering pipelines, and the characteristics of the algorithms, the series of proposed GPGPU programming models have proven its feasibility, practicality, and robustness in real engineering applications. The developed GPGPU programming framework as well as the programming models are anticipated to be adaptable for future consumer-level computing devices and other computational demanding applications. In addition, it is envisaged that the devised principles and methods in the framework design are likely to have significant benefits outside the sphere of surface metrology.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Analyzing CUDA workloads using a detailed GPU simulator

    Full text link
    corecore