
    Simulating the universe on an intercontinental grid of supercomputers

    Understanding the universe is hampered by the elusiveness of its most common constituent, cold dark matter. Almost impossible to observe, dark matter can be studied effectively by means of simulation, and there is probably no other research field where simulation has led to so much progress in the last decade. Cosmological N-body simulations are an essential tool for evolving density perturbations in the nonlinear regime. Simulating the formation of large-scale structures in the universe, however, is still a challenge because of the enormous dynamic range in spatial and temporal coordinates and the enormous computer resources required. The dynamic range is generally dealt with by the hybridization of numerical techniques. We deal with the computational requirements by connecting two supercomputers via an optical network and making them operate as a single machine. This is challenging, if only because the supercomputers of our choice are separated by half the planet: one is located in Amsterdam and the other in Tokyo. The co-scheduling of the two computers and the 'gridification' of the code enable us to achieve a 90% efficiency for this distributed intercontinental supercomputer. Comment: Accepted for publication in IEEE Compute

    The Cosmogrid Simulation: Statistical Properties of Small Dark Matter Halos

    We present the results of the "Cosmogrid" cosmological N-body simulation suites based on the concordance LCDM model. The Cosmogrid simulation was performed in a 30 Mpc box with 2048^3 particles. The mass of each particle is 1.28x10^5 Msun, which is sufficient to resolve ultra-faint dwarfs. We found that the halo mass function shows good agreement with the Sheth & Tormen fitting function down to ~10^7 Msun. We have analyzed the spherically averaged density profiles of the three most massive halos, which are of galaxy group size and contain at least 170 million particles. The slopes of these density profiles become shallower than -1 at the innermost radius. We also find a clear correlation of halo concentration with mass. The mass dependence of the concentration parameter cannot be expressed by a single power law; however, a simple model based on the Press-Schechter theory proposed by Navarro et al. gives reasonable agreement with this dependence. The spin parameter does not show a correlation with the halo mass. The probability distribution functions for both concentration and spin are well fitted by the log-normal distribution for halos with masses larger than ~10^8 Msun. The subhalo abundance depends on the halo mass: galaxy-sized halos have 50% more subhalos than ~10^11 Msun halos. Comment: 15 pages, 18 figures, accepted by Ap
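The quoted particle mass can be checked with a short calculation: a periodic box filled at the mean matter density, divided by the number of particles. The sketch below is only a consistency check, not the authors' procedure. The cosmological parameters (`omega_m = 0.3`, `h = 0.7`) are assumptions, since the abstract does not state them, and the function name `particle_mass` is hypothetical.

```python
# Critical density of the universe in units of h^2 Msun / Mpc^3 (standard value).
RHO_CRIT_H2 = 2.775e11


def particle_mass(box_mpc, n_per_side, omega_m=0.3, h=0.7):
    """Mass per particle in Msun for a box filled at the mean matter density.

    omega_m and h are ASSUMED values; the abstract does not give them.
    box_mpc is the box side in Mpc (not Mpc/h), n_per_side the particles per side.
    """
    mean_matter_density = omega_m * RHO_CRIT_H2 * h**2  # Msun / Mpc^3
    return mean_matter_density * box_mpc**3 / n_per_side**3


# Box size and particle count taken from the abstract.
m_p = particle_mass(30.0, 2048)

# Lower bound on the mass of a halo containing 170 million particles.
halo_min_mass = 170e6 * m_p

print(f"particle mass  ~ {m_p:.3e} Msun")
print(f"170M particles ~ {halo_min_mass:.2e} Msun")
```

Under these assumed parameters the particle mass comes out close to the quoted 1.28x10^5 Msun, and a halo of 170 million particles is of order 10^13 Msun, consistent with the description "galaxy group size".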

    SUPPLEMENTARY NOTES WRITTEN LANGUAGE

    Keywords: partial redundancy elimination, global value numbering, optimizing compiler, just-in-time compiler, runtime compiler, Java virtual machine.
    When developing a redundancy elimination algorithm for a runtime optimizing compiler, not only its optimizing power but also its analysis speed must be considered. We propose a fast and efficient algorithm called Partial Value Number Redundancy Elimination (PVNRE), which completely fuses Partial Redundancy Elimination (PRE) and Global Value Numbering (GVN). Using value numbers in the data-flow analyses, PVNRE can deal with data-dependent redundancy, and can quickly remove path-dependent partial redundancy by converting value numbers at join nodes on demand during the data-flow analyses.
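To illustrate what "value numbers" buy over purely lexical matching, here is a minimal sketch of hash-based value numbering on a single straight-line block. It is not PVNRE: it has no join nodes, no data-flow analysis, and no partial redundancy handling. The instruction format `(dest, op, arg1, arg2)` and all names in the code are hypothetical, chosen only for illustration.

```python
import itertools

COMMUTATIVE = {"+", "*"}


def find_redundant(block):
    """Return (dest, earlier_var) pairs where dest recomputes a value
    that earlier_var already holds.

    block is a list of (dest, op, arg1, arg2) tuples in straight-line order.
    """
    fresh = itertools.count()  # source of new value numbers
    var_vn = {}    # variable name -> value number it currently holds
    expr_vn = {}   # (op, vn1, vn2) -> value number of that expression
    vn_home = {}   # value number -> a variable that was assigned that value
    redundant = []

    def vn_of(name):
        # Inputs seen for the first time get a fresh value number.
        if name not in var_vn:
            var_vn[name] = next(fresh)
            vn_home[var_vn[name]] = name
        return var_vn[name]

    for dest, op, a, b in block:
        va, vb = vn_of(a), vn_of(b)
        if op in COMMUTATIVE and va > vb:
            va, vb = vb, va  # canonical operand order: a+b and b+a match
        key = (op, va, vb)
        vn = expr_vn.get(key)
        if vn is None:
            vn = next(fresh)
            expr_vn[key] = vn
            vn_home[vn] = dest
        elif var_vn.get(vn_home[vn]) == vn:
            # The earlier variable still holds this value: dest is redundant.
            redundant.append((dest, vn_home[vn]))
        else:
            # The earlier holder was overwritten, so dest becomes the new home.
            vn_home[vn] = dest
        var_vn[dest] = vn
    return redundant


block = [
    ("t1", "+", "a", "b"),
    ("t2", "+", "b", "a"),   # same value as t1 (commutativity)
    ("t3", "*", "t1", "c"),
    ("t4", "*", "t2", "c"),  # lexically different from t3, same value
]
result = find_redundant(block)
print(result)
```

The last instruction shows the data-dependent case: `t2 * c` and `t1 * c` differ as text, but because `t1` and `t2` share a value number the second multiplication is recognized as redundant.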

    Compression in Data Caches with Compressible Field Isolation for Recursive Data Structures

    We introduce a software/hardware scheme called the Field Array Compression Technique (FACT), which reduces cache misses due to recursive data structures. Using a data layout transformation, data with temporal affinity is gathered in contiguous memory, where the recursive pointers and integer fields are compressed. As a result, one cache block can capture a greater amount of data with temporal affinity, especially pointers, improving the prefetching effect of a cache block. In addition, the compression enlarges the effective cache capacity. On a suite of pointer-intensive programs, FACT achieves a 41.6% reduction in memory stall time and a 37.4% speedup on average.