192 research outputs found

    How to Correctly Deal With Pseudorandom Numbers in Manycore Environments - Application to GPU programming with Shoverand

    Stochastic simulations are often sensitive to the source of randomness that characterizes the statistical quality of their results. Consequently, we need highly reliable Random Number Generators (RNGs) to feed such applications. Recent developments try to shrink computation time by relying more and more on General Purpose Graphics Processing Units (GP-GPUs) to speed up stochastic simulations. Such devices bring new parallelization possibilities, but they also introduce new programming difficulties. Since RNGs are at the base of any stochastic simulation, they also need to be ported to GP-GPU. There is still a lack of well-designed implementations of quality-proven RNGs on GP-GPU platforms. In this paper, we introduce ShoveRand, a framework defining common rules to generate random numbers uniformly on GP-GPU. Our framework is designed to cope with any GPU-enabled development platform and to expose a straightforward interface to users. We also provide an existing RNG implementation with this framework to demonstrate its efficiency in both development and ease of use.

    Best practices for HPM-assisted performance engineering on modern multicore processors

    Many tools and libraries employ hardware performance monitoring (HPM) on modern processors, and using this data for performance assessment and as a starting point for code optimizations is very popular. However, such data is only useful if it is interpreted with care, and if the right metrics are chosen for the right purpose. We demonstrate the sensible use of hardware performance counters in the context of a structured performance engineering approach for applications in computational science. Typical performance patterns and their respective metric signatures are defined, and some of them are illustrated using case studies. Although these generic concepts do not depend on specific tools or environments, we restrict ourselves to modern x86-based multicore processors and use the likwid-perfctr tool under the Linux OS. Comment: 10 pages, 2 figures.

    Open Data Market Architecture and Functional Components


    Búsquedas por rango sobre plataformas GPU en espacios métricos (Range Searches on GPU Platforms in Metric Spaces)

    Similarity search consists of retrieving all objects in a database that are similar or relevant to a given query. It is currently a topic of great interest to the scientific community because of its many fields of application, such as pattern recognition, information retrieval, and multimedia databases, among others. Similarity (or proximity) search is modeled mathematically through a metric space, in which an object is represented as a black box and the only available information is the distance function from that object to the others. In general, computing the distance is expensive, so the goal is to reduce the number of distance evaluations needed to answer the query. To this end, numerous metric structures have been developed that preprocess the data in order to reduce the distance evaluations at search time. Today, the need to process large volumes of data makes a structure impractical for real applications unless it takes advantage of high-performance systems in parallel processing environments. A range of technologies exists for parallel implementations; among the newest are GPU/multi-GPU platforms, which are attractive because of their large number of processors and low cost. Track: Distributed and Parallel Processing. Red de Universidades con Carreras en Informática (RedUNCI).

    A survey on pseudonym changing strategies for Vehicular Ad-Hoc Networks

    The initial phase of the deployment of Vehicular Ad-Hoc Networks (VANETs) has begun, and many research challenges still need to be addressed. Location privacy remains at the top of these challenges. Indeed, both academia and industry have agreed to apply the pseudonym changing approach as a solution to protect the location privacy of VANET users. However, because of the pseudonym linking attack, simply changing pseudonyms has been shown to be ineffective at providing the required protection. For this reason, many pseudonym changing strategies have been suggested to make pseudonym changes effective. Unfortunately, the development of an effective pseudonym changing strategy for VANETs is still an open issue. In this paper, we present a comprehensive survey and classification of pseudonym changing strategies. We then discuss and compare them with respect to some relevant criteria. Finally, we highlight current research and open issues, and give some future directions.

    A Dynamic Run-Profile Energy-Aware Approach for Scheduling Computationally Intensive Bioinformatics Applications

    High Performance Computing (HPC) resources are housed in large datacenters, which consume exorbitant amounts of energy and are quickly demanding attention from businesses because they result in high operating costs. On the other hand, HPC environments have been very useful to researchers in many emerging areas in the life sciences, such as Bioinformatics and Medical Informatics. In an earlier work, we introduced a dynamic model for energy-aware scheduling (EAS) in an HPC environment; the model is domain agnostic and incorporates both the deadline parameter and energy parameters for computationally intensive applications. Our proposed EAS model has two phases. In the Offline Phase, we use a run-profile-based approach to generate the initial schedule. In the Online Phase, a feedback mechanism is incorporated between the EAS Engine and the master scheduling process. As scheduled tasks are completed, actual execution times are used to adjust the resources required for scheduling the remaining tasks, using the least number of nodes while meeting a given deadline. In this paper, we study how the quality of the initial schedule, which is generated from different run profiles and is the starting point for the EAS algorithm, affects the number of adjustments. That number is critical to the overall energy optimization, since every adjustment has an overhead. The conducted experiments show that the proposed approach succeeded in meeting preset deadlines while minimizing the number of nodes, thus reducing the overall energy used, and that choosing the right profile in the Offline Phase has an impact on the energy optimization achieved by the EAS algorithm.

    Applications Resilience on Clouds

    Cloud computing infrastructures support system and network fault-tolerance. They transparently repair and prevent communication and software errors. They also allow duplication and migration of jobs and data to prevent hardware failures. However, only limited work has been done so far on application resilience, i.e., the ability to resume normal execution after errors and abnormal executions in distributed environments and clouds. This paper addresses open issues and solutions for application error detection and management. It also gives an overview of a testbed used to design, deploy, execute, monitor, restart and resume distributed applications on cloud infrastructures in cases of failures.

    On the Overhead of Topology Discovery for Locality-aware Scheduling in HPC

    The increasing complexity of parallel computing platforms requires a deep knowledge of the hardware and of the application needs. Locality is a key criterion for performance optimization. Exploiting it requires software tools that expose information about the hardware topology to high performance runtime libraries. We show that the overhead of gathering such information from the operating system is significant on large computing nodes that run Linux. This overhead also increases more than linearly with the number of processes that perform it simultaneously. We then study the actual needs of the HPC software ecosystem in terms of topology information. We propose some ways to avoid multiple expensive topology discoveries and to share topology information between components such as the resource manager or the runtime libraries.