4 research outputs found

    A framework for efficient execution of data parallel irregular applications on heterogeneous systems

    Exploiting the computing power of the diversity of resources available on heterogeneous systems is mandatory but a very challenging task. The diversity of architectures, execution models and programming tools, together with disjoint address spaces and different computing capabilities, raise a number of challenges that severely impact on application performance and programming productivity. This problem is further compounded in the presence of data parallel irregular applications. This paper presents a framework that addresses development and execution of data parallel irregular applications in heterogeneous systems. A unified task-based programming and execution model is proposed, together with inter and intra-device scheduling, which, coupled with a data management system, aim to achieve performance scalability across multiple devices, while maintaining high programming productivity. Intra-device scheduling on wide SIMD/SIMT architectures resorts to consumer-producer kernels, which, by allowing dynamic generation and rescheduling of new work units, enable balancing irregular workloads and increase resource utilization. Results show that regular and irregular applications scale well with the number of devices, while requiring minimal programming effort. Consumer-producer kernels are able to sustain significant performance gains as long as the workload per basic work unit is enough to compensate overheads associated with intra-device scheduling. This not being the case, consumer kernels can still be used for the irregular application. Comparisons with an alternative framework, StarPU, which targets regular workloads, consistently demonstrate significant speedups.
This is, to the best of our knowledge, the first published integrated approach that successfully handles irregular workloads over heterogeneous systems. This work is funded by National Funds through the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) and by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) within projects PEst-OE/EEI/UI0752/2014 and FCOMP-01-0124-FEDER-010067. Also by the School of Engineering, Universidade do Minho within project P2SHOCS - Performance Portability on Scalable Heterogeneous Computing Systems.

    Practical global illumination for interactive particle visualization

    Particle-based simulation methods are used to model a wide range of complex phenomena and to solve time-dependent problems of various scales. Effective visualizations of the resulting state will communicate subtle changes in the three-dimensional structure, spatial organization, and qualitative trends within a simulation as it evolves. We present two algorithms targeting upcoming, highly parallel multicore desktop systems to enable interactive navigation and exploration of large particle datasets with global illumination effects. Monte Carlo path tracing and texture mapping are used to capture computationally expensive illumination effects such as soft shadows and diffuse interreflection. The first approach is based on precomputation of luminance textures and removes expensive illumination calculations from the interactive rendering pipeline. The second approach is based on dynamic luminance texture generation and decouples interactive rendering from the computation of global illumination effects. These algorithms provide visual cues that enhance the ability to perform analysis and feature detection tasks while interrogating the data at interactive rates. We explore the performance of these algorithms and demonstrate their effectiveness using several large datasets.

    dMitsuba 2/dt: Transporte de luz transitorio en Mitsuba 2 (Transient light transport in Mitsuba 2)

    One of the most firmly established assumptions in computer graphics and computer vision is that the speed of light is infinite. The introduction of femto-photography (capturing scenes at effective exposures of picoseconds) opened up the field of transient imaging, which exploits the information encoded in the temporal domain of light transport to solve problems such as depth estimation or seeing occluded scenes around corners. These transient-state imaging techniques require new tools that can simulate light transport at scales comparable to the speed of light, removing the assumption of an infinite speed of light. One of the greatest challenges of transient-state light simulation is the drastic increase in the execution time of these algorithms. The goal of this work is to generalize the light transport simulation software Mitsuba 2 to the transient state. Mitsuba 2 is a physically based steady-state renderer that includes support for multispectral imaging and polarization, as well as vectorized and GPU execution. Generalizing it to the transient state makes it possible to develop an efficient transient renderer that compensates for the dramatic increase in convergence time of transient-state light simulation algorithms. In addition, it is adapted for use as a differentiable renderer, allowing the derivative of a synthetic image to be computed with respect to any of its parameters. This project implements the foundations of transient-state light simulation in Mitsuba 2. To that end, we implement the transient-state path tracing algorithm on top of Mitsuba 2, including support for spectral and polarized rendering and vectorized execution on the CPU. Finally, the implemented renderer is used to analyze light propagation effects that are visible only when the speed of light is treated as finite.
This work lays the groundwork for future extensions to new transient light transport algorithms, new time-resolved models of light-matter interaction, and better reconstruction algorithms in the temporal domain. In addition, its differentiable mode can be very useful for new inverse problems that make use of transient imaging, such as seeing around corners.