4 research outputs found

    A framework for scientific computing with GPUs

    Dissertation submitted for the degree of Master in Informatics Engineering (Engenharia Informática). Commodity hardware nowadays includes not only many-core CPUs but also Graphics Processing Units (GPUs), whose highly data-parallel computational capabilities have been growing at an exponential rate. This computational power can be used for purposes other than graphics, such as the processor-intensive algorithms found in scientific computing. This thesis proposes a framework capable of distributing computational jobs over a network of CPUs and GPUs alike. The source code for each job is an OpenCL kernel, and is thus universal and independent of the specific architecture and CPU/GPU type on which it will be executed. This approach relieves the software developer of the burden of writing specific, customized revisions of the same application for each type of processor or hardware, at the cost of a possibly sub-optimal but still very efficient solution. The proposed run-time scales up as more and more powerful computing resources become available, with no need to recompile the application. Experiments showed that, although performance improvements clearly depend on the nature of the problem and how it is coded, speedups in a distributed system containing both GPUs and multi-core CPUs can reach two orders of magnitude. Centro de Informática e Tecnologias da Informação (CITI), and Fundação para a Ciência e Tecnologia (FCT/MCTES) - research projects PTDC/EIA/74325/2006, PTDC/EIA-EIA/108963/2008, PTDC/EIA-EIA/102579/2008, and PTDC/EIA-EIA/113613/200

    Plasmons in nanoparticles: atomistic Ab Initio theory for large systems

    205 p. The work carried out in this doctoral thesis focuses on the implementation of new algorithms and their application to different types of nanostructures. The scientific program in which the extensions were made is an efficient implementation of time-dependent density functional theory, known as MBPT-LCAO. The main extensions are the following: implementation of electron energy loss spectroscopy in real space, improvement of the iterative procedure to allow calculations of unprecedented size, calculation of the induced electric field, and implementation of Raman scattering spectroscopy. These implementations have been applied to sodium and silver clusters and cluster dimers, as well as to carbon and boron nitride nanotubes. Both the absorption spectrum and the induced electric fields have been calculated for all of these systems. In this way, this work has allowed us to better understand the response of such nanostructures under the influence of an external perturbation.

    Compilation techniques and language support to facilitate dependence-driven computation

    As the demand increases for high performance and power efficiency in modern computer runtime systems and architectures, programmers are left with the daunting challenge of fully exploiting these systems for efficiency, high-level expressibility, and portability across different computing architectures. Emerging programming models such as the task-based runtime StarPU and many-core architectures such as GPUs force programmers to choose between low-level programming languages and complete faith in the compiler. As has been studied in extensive detail, both development approaches have their own trade-offs. The goal of this thesis is to help make parallel programming easier. It addresses these challenges by providing new compilation techniques for high-level programming languages that conform to commonly accepted paradigms in order to leverage these emerging runtime systems and architectures. In particular, this dissertation uses the high-level programming language Chapel to efficiently map computation and data onto both the task-based runtime system StarPU and GPU-based accelerators. Different loop-based parallel programs and experiments are evaluated in order to measure the effectiveness of the proposed compiler algorithms and their optimizations, while also providing programmability metrics when leveraging high-level languages. To exploit additional performance when mapping onto shared-memory systems, this thesis proposes a set of compiler and runtime-based heuristics that determine the profitable processor tile shapes and sizes when mapping multiply-nested parallel loops. Finally, a new benchmark suite named P-Ray is presented. It provides machine characteristics in a portable manner that can be used by a compiler, an auto-tuning framework, or the programmer when optimizing applications.

    Plasmons in nanoparticles: atomistic Ab Initio theory for large systems

    205 p. The work carried out in this doctoral thesis focuses on the implementation of new algorithms and their application to different types of nanostructures. The scientific program in which the extensions were made is an efficient implementation of time-dependent density functional theory, known as MBPT-LCAO. The main extensions are the following: implementation of electron energy loss spectroscopy in real space, improvement of the iterative procedure to allow calculations of unprecedented size, calculation of the induced electric field, and implementation of Raman scattering spectroscopy. These implementations have been applied to sodium and silver clusters and cluster dimers, as well as to carbon and boron nitride nanotubes. Both the absorption spectrum and the induced electric fields have been calculated for all of these systems. In this way, this work has allowed us to better understand the response of such nanostructures under the influence of an external perturbation.