    Accelerating Monte Carlo simulations with an NVIDIA® graphics processor

    Modern graphics cards, commonly used in desktop computers, have evolved beyond a simple interface between processor and display to incorporate sophisticated calculation engines that can be applied to general-purpose computing. The Monte Carlo algorithm for modelling photon transport in turbid media has been implemented on an NVIDIA® 8800 GT graphics card using the CUDA toolkit. The Monte Carlo method relies on following the trajectories of millions of photons through the sample, a task that often takes hours or days to complete. The graphics-processor implementation, processing roughly 110 million scattering events per second, was found to run more than 70 times faster than a similar, single-threaded implementation on a 2.67 GHz desktop computer.
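
    The per-photon random walk that such a GPU implementation parallelizes, typically with one CUDA thread per photon, can be sketched serially. The Python sketch below assumes a homogeneous, infinite medium with Henyey-Greenstein scattering; the optical parameters, weight cutoff, and roulette settings are illustrative and are not taken from the article.

    import numpy as np

    rng = np.random.default_rng(0)

    MU_A, MU_S, G = 0.1, 10.0, 0.9          # absorption, scattering (1/cm), anisotropy
    MU_T = MU_A + MU_S
    WEIGHT_CUTOFF, ROULETTE_CHANCE = 1e-3, 0.1

    def scatter(direction):
        """Sample a new unit direction from the Henyey-Greenstein phase function."""
        if G == 0.0:
            cos_t = 2.0 * rng.random() - 1.0
        else:
            tmp = (1.0 - G * G) / (1.0 - G + 2.0 * G * rng.random())
            cos_t = (1.0 + G * G - tmp * tmp) / (2.0 * G)
        sin_t = np.sqrt(1.0 - cos_t * cos_t)
        phi = 2.0 * np.pi * rng.random()
        ux, uy, uz = direction
        if abs(uz) > 0.99999:                # nearly vertical: simplified rotation
            return np.array([sin_t * np.cos(phi), sin_t * np.sin(phi), np.sign(uz) * cos_t])
        denom = np.sqrt(1.0 - uz * uz)
        return np.array([
            sin_t * (ux * uz * np.cos(phi) - uy * np.sin(phi)) / denom + ux * cos_t,
            sin_t * (uy * uz * np.cos(phi) + ux * np.sin(phi)) / denom + uy * cos_t,
            -sin_t * np.cos(phi) * denom + uz * cos_t,
        ])

    def propagate_photon():
        """Follow one photon until its weight is extinguished; return total path length."""
        pos = np.zeros(3)
        direction = np.array([0.0, 0.0, 1.0])
        weight, path = 1.0, 0.0
        while weight > 0.0:
            step = -np.log(1.0 - rng.random()) / MU_T   # free path to the next interaction
            pos += step * direction
            path += step
            weight *= MU_S / MU_T                       # drop the absorbed fraction MU_A / MU_T
            direction = scatter(direction)
            if weight < WEIGHT_CUTOFF:                  # Russian roulette termination
                if rng.random() < ROULETTE_CHANCE:
                    weight /= ROULETTE_CHANCE
                else:
                    weight = 0.0
        return path

    mean_path = np.mean([propagate_photon() for _ in range(200)])
    print(f"mean photon path length: {mean_path:.1f} cm")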

    Investigation of LSTM Based Prediction for Dynamic Energy Management in Chip Multiprocessors

    In this paper, we investigate the effectiveness of using long short-term memory (LSTM) networks instead of Kalman filtering for prediction in dynamic energy management (DEM) algorithms for chip multiprocessors (CMPs). Either of the two prediction methods is employed to estimate the workload of each processor core in the next control period. These estimates are then used to select voltage-frequency (VF) pairs for each core of the CMP during the next control period as part of a dynamic voltage and frequency scaling (DVFS) technique. The objective of the DVFS technique is to reduce energy consumption under performance constraints set by the user. We conduct our investigation using a custom Sniper-based system simulation framework. Simulation results for 16- and 64-core network-on-chip based CMP architectures, using several benchmarks, demonstrate that the LSTM predictor performs slightly better than Kalman filtering.
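
    The control loop described above (predict each core's workload for the next control period, then pick a V-F pair that covers the predicted demand) can be sketched as follows. This is an illustration rather than the authors' code: the choice of PyTorch, the window length, the MIPS workload metric, and the V-F table are all assumptions.

    import torch
    import torch.nn as nn

    class WorkloadLSTM(nn.Module):
        """Predict the next control period's workload from a window of past workloads."""

        def __init__(self, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, history):           # history: (batch, window, 1)
            out, _ = self.lstm(history)
            return self.head(out[:, -1, :])   # predicted workload for the next period

    # Illustrative discrete V-F table (volts, GHz): choose the lowest-power pair whose
    # frequency still covers the predicted demand, i.e. meets the performance constraint.
    VF_TABLE = [(0.8, 1.0), (0.9, 1.5), (1.0, 2.0), (1.1, 2.5)]

    def select_vf(predicted_mips, mips_per_ghz=1000.0):
        required_ghz = predicted_mips / mips_per_ghz
        for volt, freq in VF_TABLE:
            if freq >= required_ghz:
                return volt, freq
        return VF_TABLE[-1]                   # demand exceeds the table: run flat out

    model = WorkloadLSTM()                    # in practice, trained on per-core workload traces
    window = torch.rand(1, 8, 1) * 2000.0     # last 8 control periods of MIPS for one core
    prediction = model(window).item()
    print(select_vf(prediction))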

    Performance Debugging and Tuning using an Instruction-Set Simulator

    Instruction-set simulators allow programmers a detailed level of insight into, and control over, the execution of a program, including parallel programs and operating systems. In principle, instruction-set simulation can model any target computer and gather any statistic. Furthermore, such simulators are usually portable, independent of compiler tools, and deterministic, allowing bugs to be recreated or measurements repeated. Though often viewed as too slow for use as general programming tools, their performance has improved considerably in the last several years. We describe SimICS, an instruction-set simulator of SPARC-based multiprocessors developed at SICS, in its rôle as a general programming tool. We discuss some of the benefits of using a tool such as SimICS to support various tasks in software engineering, including debugging, testing, analysis, and performance tuning. We present in some detail two test cases, where we have used SimICS to support analysis and performance tuning of two applications, Penny and EQNTOTT. This work resulted in improved parallelism in, and understanding of, Penny, as well as a performance improvement for EQNTOTT of over an order of magnitude. We also present some early work on analyzing SPARC/Linux, demonstrating the ability of tools like SimICS to analyze operating systems.
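
    SimICS models SPARC systems at full instruction-set fidelity; the toy interpreter below only illustrates the fetch-decode-execute loop and the deterministic, per-instruction statistics gathering that such simulators build on. The three-instruction ISA and the example program are invented for illustration.

    from collections import Counter

    def simulate(program, max_steps=10_000):
        """Interpret a tiny register-machine program, counting instructions as we go."""
        regs = [0] * 4                      # r0..r3
        pc = 0
        stats = Counter()                   # deterministic, repeatable instruction mix
        for _ in range(max_steps):
            if not 0 <= pc < len(program):  # fell off the end of the program: halt
                break
            op, *args = program[pc]
            stats[op] += 1
            if op == "li":                  # li rd, imm
                regs[args[0]] = args[1]
                pc += 1
            elif op == "add":               # add rd, rs1, rs2
                regs[args[0]] = regs[args[1]] + regs[args[2]]
                pc += 1
            elif op == "bnez":              # bnez rs, target: branch if register non-zero
                pc = args[1] if regs[args[0]] != 0 else pc + 1
            else:
                raise ValueError(f"unknown opcode: {op}")
        return regs, stats

    # Sum 5 + 4 + 3 + 2 + 1 with a small loop, then inspect the instruction mix.
    program = [
        ("li", 0, 5),       # r0 = loop counter
        ("li", 1, 0),       # r1 = accumulator
        ("li", 2, -1),      # r2 = decrement
        ("add", 1, 1, 0),   # r1 += r0
        ("add", 0, 0, 2),   # r0 -= 1
        ("bnez", 0, 3),     # repeat while r0 != 0
    ]
    regs, stats = simulate(program)
    print(regs[1], dict(stats))             # 15 {'li': 3, 'add': 10, 'bnez': 5}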

    Comparison of Different Parallel Implementations of the 2+1-Dimensional KPZ Model and the 3-Dimensional KMC Model

    We show that efficient simulations of Kardar-Parisi-Zhang (KPZ) interface growth in 2+1 dimensions and of 3-dimensional kinetic Monte Carlo modelling of thermally activated diffusion can be realized both on GPUs and on modern CPUs. In this article we present results of different implementations on GPUs using CUDA and OpenCL, and also on CPUs using OpenCL and MPI. We investigate the runtime and scaling behavior on different architectures to find optimal solutions for solving current simulation problems in the field of statistical physics and materials science.
    Comment: 14 pages, 8 figures, to be published in a forthcoming EPJST special issue on "Computer simulations on GPU".
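
    As a rough illustration of the kind of lattice update such codes parallelize, here is a serial Python sketch of a (2+1)-dimensional restricted solid-on-solid (RSOS) growth model, a standard lattice model in the KPZ universality class. The article's GPU and CPU implementations rely on their own lattice representation and domain decomposition; the model choice, lattice size, and width measurement here are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)

    def rsos_growth(L=64, monolayers=30):
        """Grow an L x L surface; a site grows only if all height steps stay <= 1."""
        h = np.zeros((L, L), dtype=np.int64)
        widths = []
        for _ in range(monolayers):
            for _ in range(L * L):                        # one Monte Carlo sweep
                x, y = rng.integers(L, size=2)
                new_h = h[x, y] + 1
                neighbours = (h[(x + 1) % L, y], h[(x - 1) % L, y],
                              h[x, (y + 1) % L], h[x, (y - 1) % L])
                if all(new_h - n <= 1 for n in neighbours):
                    h[x, y] = new_h                       # RSOS constraint satisfied
            widths.append(h.std())                        # interface width W(t) per sweep
        return h, widths

    surface, widths = rsos_growth()
    print(f"final interface width: {widths[-1]:.3f}")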