    An Algebra of Synchronous Scheduling Interfaces

    In this paper we propose an algebra of synchronous scheduling interfaces which combines the expressiveness of Boolean algebra for logical and functional behaviour with min-max-plus arithmetic for quantifying the non-functional aspects of synchronous interfaces. The interface theory arises from a realisability interpretation of intuitionistic modal logic (also known as the Curry-Howard isomorphism or propositions-as-types principle). The resulting algebra of interface types aims to provide a general setting for specifying type-directed and compositional analyses of worst-case scheduling bounds. It covers synchronous control flow under concurrent, multi-processing or multi-threading execution and permits precise statements about exactness and coverage of the analyses, supporting a variety of abstractions. The paper illustrates the expressiveness of the algebra by way of some examples taken from network flow problems, shortest-path, task scheduling and worst-case reaction times in synchronous programming. Comment: In Proceedings FIT 2010, arXiv:1101.426
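
    The min-max-plus arithmetic mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's interface algebra; it only shows the underlying arithmetic on two of the named example domains. The function names (`min_plus_product`, `shortest_paths`, `seq`, `par`) and the toy graph are assumptions made for illustration: min-plus matrix products give shortest-path lengths, and max-plus combinations give a simple worst-case bound for sequential and parallel composition.

```python
import math

INF = math.inf  # "no edge" in the min-plus semiring


def min_plus_product(a, b):
    """Min-plus matrix product: c[i][j] = min over k of a[i][k] + b[k][j]."""
    inner = len(b)
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def shortest_paths(weights):
    """All-pairs shortest path lengths by repeated min-plus squaring.

    Assumes a square matrix with 0 on the diagonal and INF for missing edges,
    and no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    steps = 1  # dist currently covers paths of at most `steps` edges
    while steps < n - 1:
        dist = min_plus_product(dist, dist)
        steps *= 2
    return dist


def seq(*bounds):
    """Worst-case bound of running parts one after another (plus)."""
    return sum(bounds)


def par(*bounds):
    """Worst-case bound of waiting for all parallel parts (max)."""
    return max(bounds)


# Hypothetical 3-node graph: 0 -> 1 costs 4, 0 -> 2 costs 1, 2 -> 1 costs 2.
toy_graph = [
    [0, 4, 1],
    [INF, 0, INF],
    [INF, 2, 0],
]
toy_dist = shortest_paths(toy_graph)
```

    In this toy graph the direct edge from node 0 to node 1 costs 4, while the detour through node 2 costs 1 + 2, so `toy_dist[0][1]` is 3. Likewise `seq(par(3, 5), 2)` combines a parallel join with a sequential step and evaluates to 7.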

    Sparse Tensor Transpositions

    We present a new algorithm for transposing sparse tensors called Quesadilla. The algorithm converts the sparse tensor data structure to a list of coordinates and sorts it with a fast multi-pass radix algorithm that exploits knowledge of the requested transposition and the tensor's input partial coordinate ordering to provably minimize the number of parallel partial sorting passes. We evaluate both a serial and a parallel implementation of Quesadilla on a set of 19 tensors from the FROSTT collection, a set of tensors taken from scientific and data analytic applications. We compare Quesadilla and a generalization, Top-2-sadilla, to several state-of-the-art approaches, including the tensor transposition routine used in the SPLATT tensor factorization library. In serial tests, Quesadilla was the best strategy for 60% of all tensor and transposition combinations and improved over SPLATT by at least 19% in half of the combinations. In parallel tests, at least one of Quesadilla or Top-2-sadilla was the best strategy for 52% of all tensor and transposition combinations. Comment: This work will be the subject of a brief announcement at the 32nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '20
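
    The general idea of transposing a sparse tensor by sorting its coordinate list can be sketched as follows. This is a naive baseline, not Quesadilla itself: it always performs one stable pass per mode (a least-significant-digit radix sort over the modes) and makes no attempt to skip passes based on the existing ordering, which is the optimization the abstract describes. The function name `transpose_coo` and the toy tensor are assumptions made for illustration.

```python
def transpose_coo(coords, values, perm):
    """Permute the modes of a COO sparse tensor and re-sort its entries.

    coords: list of index tuples, one per nonzero
    values: list of nonzero values, aligned with coords
    perm:   new mode order, e.g. (2, 0, 1) makes old mode 2 the first mode
    """
    # Step 1: relabel every coordinate according to the requested transposition.
    entries = [(tuple(c[p] for p in perm), v) for c, v in zip(coords, values)]

    # Step 2: LSD radix sort over modes. Sorting by the last mode first and the
    # first mode last, with a stable sort in each pass, yields lexicographic order.
    for mode in reversed(range(len(perm))):
        entries.sort(key=lambda entry: entry[0][mode])  # list.sort is stable

    return [c for c, _ in entries], [v for _, v in entries]


# Hypothetical 3-mode tensor with three nonzeros.
toy_coords = [(0, 1, 2), (0, 2, 0), (1, 0, 1)]
toy_values = [1.0, 2.0, 3.0]
new_coords, new_values = transpose_coo(toy_coords, toy_values, (2, 0, 1))
```

    For the toy input, the permuted coordinates are (2, 0, 1), (0, 0, 2) and (1, 1, 0), so after sorting `new_coords` is `[(0, 0, 2), (1, 1, 0), (2, 0, 1)]` with `new_values` reordered to `[2.0, 3.0, 1.0]`.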

    High accuracy binary black hole simulations with an extended wave zone

    We present results from a new code for binary black hole evolutions using the moving-puncture approach, implementing finite differences in generalised coordinates, and allowing the spacetime to be covered with multiple communicating non-singular coordinate patches. Here we consider a regular Cartesian near zone, with adapted spherical grids covering the wave zone. The efficiencies resulting from the use of adapted coordinates allow us to maintain sufficient grid resolution to an artificial outer boundary location which is causally disconnected from the measurement. For the well-studied test-case of the inspiral of an equal-mass non-spinning binary (evolved for more than 8 orbits before merger), we determine the phase and amplitude to numerical accuracies better than 0.010% and 0.090% during inspiral, respectively, and 0.003% and 0.153% during merger. The waveforms, including the resolved higher harmonics, are convergent and can be consistently extrapolated to $r\to\infty$ throughout the simulation, including the merger and ringdown. Ringdown frequencies for these modes (to $(\ell,m)=(6,6)$) match perturbative calculations to within 0.01%, providing a strong confirmation that the remnant settles to a Kerr black hole with irreducible mass $M_{\rm irr} = 0.884355\pm20\times10^{-6}$ and spin $S_f/M_f^2 = 0.686923 \pm 10\times10^{-6}$

    A massively parallel exponential integrator for advection-diffusion models

    This work considers the Real Leja Points Method (ReLPM) for the exponential integration of large-scale sparse systems of ODEs, generated by Finite Element or Finite Difference discretizations of 3-D advection-diffusion models. We present an efficient parallel implementation of ReLPM for polynomial interpolation of the matrix exponential propagators. A scalability analysis of the most important computational kernel inside the code, the parallel sparse matrix–vector product, has been performed, as well as an experimental study of the communication overhead. As a result of this study, an optimized parallel sparse matrix–vector product routine has been implemented. The resulting code shows good scaling behavior even when using more than one thousand processors. The numerical results presented on a number of very large test cases give experimental evidence that ReLPM is a reliable and efficient tool for the simulation of complex hydrodynamic processes on parallel architectures.
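
    The sparse matrix–vector product named above as the key kernel can be sketched in a few lines. This is a serial illustration only, assuming the common compressed sparse row (CSR) layout; the abstract does not say which storage format or partitioning the authors use. The function name `csr_matvec`, the `row_start`/`row_end` parameters and the toy matrix are hypothetical. The row range hints at how a row-block partition could give each process its own slice of the result.

```python
def csr_matvec(indptr, indices, data, x, row_start=0, row_end=None):
    """Compute rows [row_start, row_end) of y = A @ x for a CSR matrix A.

    indptr:  row pointers, length n_rows + 1
    indices: column index of each stored nonzero
    data:    value of each stored nonzero
    x:       dense input vector
    """
    if row_end is None:
        row_end = len(indptr) - 1
    y = []
    for row in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of this row occupy positions indptr[row] .. indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


# Hypothetical 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR form.
toy_indptr = [0, 2, 3, 5]
toy_indices = [0, 2, 1, 0, 2]
toy_data = [2.0, 1.0, 3.0, 4.0, 5.0]
toy_x = [1.0, 2.0, 3.0]

full_y = csr_matvec(toy_indptr, toy_indices, toy_data, toy_x)
# Two row blocks, as two workers might compute them independently.
block_y = (csr_matvec(toy_indptr, toy_indices, toy_data, toy_x, 0, 2)
           + csr_matvec(toy_indptr, toy_indices, toy_data, toy_x, 2, 3))
```

    For the toy matrix and vector, `full_y` is `[5.0, 6.0, 19.0]`, and concatenating the two row blocks in `block_y` gives the same result. In a distributed setting each worker would also need the entries of `x` referenced by its rows, which is where the communication overhead studied in the abstract would arise.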