164 research outputs found

Doctor of Philosophy

Author: King James Sokhom
Publication venue: University of Utah
Publication date: 01/01/2017

Dissertation

Memory access irregularities are a major bottleneck for bandwidth-limited problems on Graphics Processing Unit (GPU) architectures. GPU memory systems are designed to allow consecutive memory accesses to be coalesced into a single memory access. Noncontiguous accesses within a parallel group of threads working in lock step may cause serialized memory transfers. Irregular algorithms may have data-dependent control flow and memory access, which requires runtime information to be evaluated. Compile-time methods for evaluating parallelism, such as static dependence graphs, are not capable of evaluating irregular algorithms. The goals of this dissertation are to study irregularities within the context of unstructured mesh and sparse matrix problems, analyze the impact of vectorization widths on irregularities, and present data-centric methods that improve control flow and memory access irregularity within those contexts. Reordering associative operations has often been exploited for performance gains in parallel algorithms. This dissertation presents a method for associative reordering of stencil computations over unstructured meshes that increases data reuse through caching. This novel parallelization scheme offers considerable speedups over standard methods. Vectorization widths can have a significant impact on performance in vectorized computations. Although the hardware vector width is generally fixed, the logical vector width used within a computation can range from one up to the width of the computation. Significant performance differences can occur due to thread scheduling and resource limitations. This dissertation analyzes the impact of vectorization widths on dense numerical computations such as 3D dG postprocessing. It is difficult to efficiently perform dynamic updates on traditional sparse matrix formats. Explicitly controlling memory segmentation allows for in-place dynamic updates in sparse matrices. Dynamically updating the matrix without rebuilding or sorting greatly improves processing time and overall throughput. This dissertation presents a new sparse matrix format, dynamic compressed sparse row (DCSR), which allows for dynamic streaming updates to a sparse matrix. A new method for parallel sparse matrix-matrix multiplication (SpMM) that uses dynamic updates is also presented.
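
The coalescing behaviour mentioned in this abstract can be illustrated with a toy model. The sketch below is not taken from the dissertation: it assumes a lock-step group of 32 threads, 4-byte elements, and 128-byte aligned memory segments, and it simply counts how many distinct segments one group touches. The function name `count_transactions` and the sample index array are hypothetical.

```python
def count_transactions(addresses, segment_bytes=128):
    """Count the distinct aligned memory segments touched by one thread group.

    Simplified model (an assumption): each distinct segment costs one transfer.
    """
    return len({addr // segment_bytes for addr in addresses})


WARP = 32  # threads per lock-step group (assumed)
WORD = 4   # bytes per element (assumed 32-bit values)

# Consecutive access: thread t reads element t.
contiguous = [t * WORD for t in range(WARP)]

# Strided access: thread t reads element 32 * t.
strided = [t * 32 * WORD for t in range(WARP)]

# Data-dependent access through an index array, as in sparse or mesh problems.
index = [7, 900, 7, 64, 65, 3000, 12, 13] * 4  # hypothetical indices
gathered = [i * WORD for i in index]

print(count_transactions(contiguous))  # 1
print(count_transactions(strided))     # 32
print(count_transactions(gathered))    # 4
```

Under these assumptions the consecutive pattern needs a single transfer, the strided pattern needs one per thread, and the indirect pattern depends entirely on runtime index values, which is why it cannot be evaluated at compile time.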

Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm

Author: Bendjoudi Ahcène
Chakroun Imen
Melab Nouredine
Mezmaz Mohand
Publication venue: Wiley
Publication date: 01/01/2012

In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound (B&B) algorithms for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. GPUs, on the other hand, are massively multi-threaded accelerators that follow the SIMD model at execution. A major issue that arises when executing a B&B applied to FSP on a GPU is thread or branch divergence. Such divergence is caused by the lower bound function of FSP, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP so that it deals with thread divergence. Extensive experiments with the proposed approach have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, speedups of up to ×77.46 are achieved for large problem instances.
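
The cost of branch divergence under a SIMD execution model can be sketched with a small cost model. This is a generic illustration and not the approach proposed in the paper, which the abstract does not detail. It assumes that a group of lock-step threads executes every branch taken by at least one of its threads, one branch after another. The names `warp_cost` and `total_cost`, the group size of 32, and the branch costs are hypothetical.

```python
def warp_cost(branches, cost_of_branch):
    """Cost of one lock-step group: every distinct branch taken is run serially."""
    return sum(cost_of_branch[b] for b in set(branches))


def total_cost(branch_per_thread, cost_of_branch, warp_size=32):
    """Split threads into consecutive groups and add up the group costs."""
    return sum(
        warp_cost(branch_per_thread[i:i + warp_size], cost_of_branch)
        for i in range(0, len(branch_per_thread), warp_size)
    )


cost = {"short": 10, "long": 30}  # hypothetical per-branch costs

# 64 threads whose branch choice alternates: both groups diverge.
interleaved = ["short", "long"] * 32

# The same work regrouped so each group follows a single branch.
grouped = sorted(interleaved)

print(total_cost(interleaved, cost))  # 80
print(total_cost(grouped, cost))      # 40
```

In this model, regrouping work so that neighbouring threads follow the same path halves the cost. Real hardware behaviour depends on many other factors, so the numbers only indicate the direction of the effect.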

HAL - Lille 3

INRIA a CCSD electronic archive server

Hal-Diderot

Treatment of Synchronizations in Compiling Fine-Grained SPMD-Threaded Programs for CPU

Author: Guo Ziyu
Publication venue: W&M ScholarWorks
Publication date: 01/01/2011

College of William & Mary: W&M Publish

Consulting project for TAP's melhoria contínua area to increase operational efficiency at Lisbon HUB

Author: Andrade Carolina Maria Morais Cardoso Freire de
Coimbra Maria Inês Forjaz Morão Dias
Martins Francisco Maria Cavaleiro Gonçalves Pereira
Miranda Augusto José Casalta
Sampaio Francisco Gil Ferreira
Publication date: 16/01/2014

This report briefly describes the project developed under NOVA SBE’s Management Consulting Field Labs initiative by five NOVA SBE students for the airline company TAP, in its melhoria contínua (continuous improvement) area. The objectives were to reduce TAP’s operational irregularities and to reduce the Minimum Connecting Time (MCT) by approximately 15 minutes at Lisbon Airport. To address these challenges, the team adopted a practical work approach that proved to have a positive final impact on the company, namely the implementation of recommendations for operational irregularities would sav