
    Effective Monte Carlo simulation on System-V massively parallel associative string processing architecture

    We show that the latest version of the massively parallel associative string processing architecture (System-V) is suitable for fast Monte Carlo simulation if an effective on-processor random number generator is implemented. Our lagged Fibonacci generator can produce $10^8$ random numbers on a processor string of 12K PEs. The time-dependent Monte Carlo algorithm of the one-dimensional non-equilibrium kinetic Ising model runs 80 times faster than the corresponding serial algorithm on a 300 MHz UltraSparc. Comment: 8 pages, 9 color ps figures embedded
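The abstract above does not give the generator's parameters, so the following is only a minimal serial sketch of an additive lagged Fibonacci generator, x_n = (x_{n-j} + x_{n-k}) mod 2^m. The lags (24, 55), the 32-bit word size, and the function name `lagged_fibonacci` are assumptions chosen for illustration and are not taken from the paper.

```python
import random
from collections import deque


def lagged_fibonacci(seed, short_lag=24, long_lag=55, bits=32):
    """Yield x_n = (x_{n-short_lag} + x_{n-long_lag}) mod 2**bits forever.

    The lags and word size are illustrative assumptions, not the
    parameters used on the System-V hardware.
    """
    mask = (1 << bits) - 1
    seeder = random.Random(seed)
    # Keep the last `long_lag` values; the deque drops the oldest on append.
    state = deque(
        (seeder.getrandbits(bits) for _ in range(long_lag)),
        maxlen=long_lag,
    )
    # An additive generator needs at least one odd value in its seed state.
    state[0] |= 1
    while True:
        value = (state[-short_lag] + state[-long_lag]) & mask
        state.append(value)
        yield value
```

Taking the first few values with `itertools.islice(lagged_fibonacci(1), 5)` is enough to try it out; a parallel version would give each processing element its own independently seeded state.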

    Update statistics in conservative parallel discrete event simulations of asynchronous systems

    We model the performance of an ideal closed chain of L processing elements that work in parallel in an asynchronous manner. Their state updates follow a generic conservative algorithm. The conservative update rule determines the growth of a virtual time surface. The physics of this growth is reflected in the utilization (the fraction of working processors) and in the interface width. We show that it is possible to make an explicit connection between the utilization and the macroscopic structure of the virtual time interface. We exploit this connection to derive the theoretical probability distribution of updates in the system within an approximate model. It follows that the theoretical lower bound for the computational speed-up is s=(L+1)/4 for L>3. Our approach uses simple statistics to count distinct surface configuration classes consistent with the model growth rule. It enables one to compute analytically microscopic properties of an interface, which are unavailable by continuum methods. Comment: 15 pages, 12 figures
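To make the notion of utilization concrete, here is a small sketch of a conservative update rule on a closed chain. It assumes a textbook variant that the abstract does not spell out: a processing element advances only if its local virtual time does not exceed that of either nearest neighbour, and each advance is an exponentially distributed increment. The function name and these modelling choices are assumptions, so the measured value need not match the paper's approximate model.

```python
import random


def conservative_utilization(num_pes, steps, seed=0):
    """Estimate the fraction of processing elements that update per step.

    Assumed rule (illustrative): PE i updates only if its virtual time is
    not larger than that of both ring neighbours.
    """
    rng = random.Random(seed)
    tau = [0.0] * num_pes  # local virtual times
    updated_total = 0
    for _ in range(steps):
        # Decide who may update from the times at the start of the step.
        active = [
            i for i in range(num_pes)
            if tau[i] <= tau[i - 1] and tau[i] <= tau[(i + 1) % num_pes]
        ]
        for i in active:
            tau[i] += rng.expovariate(1.0)
        updated_total += len(active)
    return updated_total / (num_pes * steps)
```

Multiplying the returned utilization by the number of processing elements gives a rough speed-up estimate for this toy rule, which can be set beside the bound quoted in the abstract.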

    Parallel Peeling Algorithms

    The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than k are removed until there are no vertices of degree less than k left. The remaining hypergraph is known as the k-core. In this paper, we analyze parallel peeling processes, where in each round, all vertices of degree less than k are removed. It is known that, below a specific edge density threshold, the k-core is empty with high probability. We show that, with high probability, below this threshold, only (log log n)/log((k-1)(r-1)) + O(1) rounds of peeling are needed to obtain the empty k-core for r-uniform hypergraphs. Interestingly, we show that above this threshold, Omega(log n) rounds of peeling are required to find the non-empty k-core. Since most algorithms and data structures aim to peel to an empty k-core, this asymmetry appears fortunate. We verify the theoretical results both with simulation and with a parallel implementation using graphics processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice. Comment: Appears in SPAA 2014. Minor typo corrections relative to previous version
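The round-synchronous peeling process described above can be sketched as follows. This is a serial simulation of the rounds, not the authors' GPU implementation; the function name `parallel_peel` and the edge-list representation are assumptions made for illustration.

```python
from collections import defaultdict


def parallel_peel(num_vertices, edges, k):
    """Peel a hypergraph in rounds; return (rounds, k_core_vertices).

    `edges` is a list of vertex tuples. In each round, every vertex whose
    degree is below k at the start of the round is removed together with
    its incident edges.
    """
    incident = defaultdict(set)  # vertex -> ids of its surviving edges
    for edge_id, edge in enumerate(edges):
        for v in edge:
            incident[v].add(edge_id)
    alive = set(range(num_vertices))
    rounds = 0
    while True:
        # Removals are decided from the state at the start of the round.
        to_remove = [v for v in alive if len(incident[v]) < k]
        if not to_remove:
            break
        rounds += 1
        for v in to_remove:
            alive.discard(v)
            for edge_id in list(incident[v]):
                # Deleting an edge lowers the degree of all its vertices.
                for u in edges[edge_id]:
                    incident[u].discard(edge_id)
    return rounds, alive
```

For example, the path 0-1-2-3 with k = 2 loses its two endpoints in the first round and the two middle vertices in the second, leaving an empty 2-core after two rounds.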

    Doubly Charmed Baryons in COMPASS

    The search for doubly charmed baryons has been a topic for COMPASS from the beginning. However, because it requires a complete spectrometer and the highest possible trigger rates, this measurement has been postponed. The scenario for such a measurement in the second phase of COMPASS is outlined here. First studies of triggering and simulation of the setup have been performed. New rate estimates based on recent measurements from SELEX at FNAL are presented. Comment: 13 pages, 15 figures, contribution to the Workshop on Future Physics at COMPASS, CERN, Geneva, September 26-27 2002, to appear as CERN Yellow Report

    Parallelization of a Six Degree of Freedom Entry Vehicle Trajectory Simulation Using OpenMP and OpenACC

    The art and science of writing parallelized software, using methods such as Open Multi-Processing (OpenMP) and Open Accelerators (OpenACC), is dominated by computer scientists. Engineers and non-computer scientists looking to apply these techniques to their project applications face a steep learning curve, especially when looking to adapt their original single-threaded software to run multi-threaded on graphics processing units (GPUs). Significant changes in mindset must occur, such as how to manage memory, how to organize instructions, and how to use if statements (also known as branching). The purpose of this work is twofold: 1) to demonstrate the applicability of parallelized coding methodologies, OpenMP and OpenACC, to tasks outside of the typical large-scale matrix mathematics; and 2) to discuss, from an engineer's perspective, the lessons learned from parallelizing software using these computer science techniques. This work applies OpenMP on both multi-core central processing units (CPUs) and the Intel Xeon Phi 7210, and OpenACC on GPUs. These parallelization techniques are used to tackle the simulation of thousands of entry vehicle trajectories through the integration of six degree of freedom (DoF) equations of motion (EoM). The forces and moments acting on the entry vehicle, and used by the EoM, are estimated using multiple models of varying levels of complexity. Several benchmark comparisons are made on the execution of the six DoF trajectory simulation: single-thread Intel Xeon E5-2670 CPU, multi-thread CPU using OpenMP, multi-thread Xeon Phi 7210 using OpenMP, and multi-thread NVIDIA Tesla K40 GPU using OpenACC. These benchmarks are run on the Pleiades Supercomputer Cluster at the National Aeronautics and Space Administration (NASA) Ames Research Center (ARC), and a Xeon Phi 7210 node at NASA Langley Research Center (LaRC).

    Simulating streamer discharges in 3D with the parallel adaptive Afivo framework

    We present an open-source plasma fluid code for 2D, cylindrical and 3D simulations of streamer discharges, based on the Afivo framework that features adaptive mesh refinement, geometric multigrid methods for Poisson's equation, and OpenMP parallelism. We describe the numerical implementation of a fluid model of the drift-diffusion-reaction type, combined with the local field approximation. Then we demonstrate its functionality with 3D simulations of long positive streamers in nitrogen in undervolted gaps, using three examples. The first example shows how a stochastic background density affects streamer propagation and branching. The second one focuses on the interaction of a streamer with preionized regions, and the third one investigates the interaction between two streamers. The simulations run on up to $10^8$ grid cells within less than a day. Without mesh refinement, they would require $4\cdot 10^{12}$ grid cells.

    Avalanches in self-organized critical neural networks: A minimal model for the neural SOC universality class

    The brain keeps its overall dynamics in a corridor of intermediate activity, and it has been a long-standing question what possible mechanism could achieve this task. Mechanisms from the field of statistical physics have long suggested that this homeostasis of brain activity could occur even without a central regulator, via self-organization on the level of neurons and their interactions alone. Such physical mechanisms from the class of self-organized criticality exhibit characteristic dynamical signatures, similar to seismic activity related to earthquakes. Measurements of cortex rest activity showed first signs of dynamical signatures potentially pointing to self-organized critical dynamics in the brain. Indeed, recent more accurate measurements allowed for a detailed comparison with scaling theory of non-equilibrium critical phenomena, proving the existence of criticality in cortex dynamics. We here compare this new evaluation of cortex activity data to the predictions of the earliest physics spin model of self-organized critical neural networks. We find that the model matches the recent experimental data and its interpretation in terms of dynamical signatures for criticality in the brain. The combination of signatures for criticality, power law distributions of avalanche sizes and durations, as well as a specific scaling relationship between anomalous exponents, defines a universality class characteristic of the particular critical phenomenon observed in the neural experiments. The spin model is a candidate for a minimal model of a self-organized critical adaptive network for the universality class of neural criticality. As a prototype model, it provides the background for models that include more biological details, yet share the same universality class characteristic of the homeostasis of activity in the brain. Comment: 17 pages, 5 figures