112 research outputs found

    Kranc: a Mathematica application to generate numerical codes for tensorial evolution equations

    We present a suite of Mathematica-based computer-algebra packages, termed "Kranc", which comprise a toolbox to convert (tensorial) systems of partial differential evolution equations to parallelized C or Fortran code. Kranc can be used as a "rapid prototyping" system for physicists or mathematicians handling very complicated systems of partial differential equations, but through integration into the Cactus computational toolkit we can also produce efficient parallelized production codes. Our work is motivated by the field of numerical relativity, where Kranc is used as a research tool by the authors. In this paper we describe the design and implementation of both the Mathematica packages and the resulting code, discuss some example applications, and provide results on the performance of an example numerical code for the Einstein equations. (Comment: 24 pages, 1 figure. Corresponds to journal version.)
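    Kranc itself is a Mathematica application, but the underlying idea (turning a symbolic statement of an evolution equation into compilable loop code) can be sketched in a few lines of Python with SymPy. Everything below, from the first-order wave equation to the stencil and variable names, is an illustrative stand-in rather than Kranc's actual interface:

```python
# Minimal sketch of symbolic-equation-to-code generation in the spirit of
# Kranc, using Python/SymPy instead of Mathematica. All names are illustrative.
import sympy as sp

# First-order form of the 1D wave equation: d_t u = v, d_t v = c^2 d_x^2 u.
u, v, c, dx = sp.symbols('u v c dx')
u_m, u_p = sp.symbols('u_m u_p')  # left/right neighbours of u on the grid

rhs_u = v
# The second derivative is replaced by a centred finite-difference stencil.
rhs_v = c**2 * (u_p - 2*u + u_m) / dx**2

# Emit C assignments for the right-hand sides; a full generator would wrap
# these in grid loops and parallelization boilerplate (as Kranc does via Cactus).
print(sp.ccode(rhs_u, assign_to='rhs_u'))
print(sp.ccode(rhs_v, assign_to='rhs_v'))
```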

    Turduckening black holes: an analytical and computational study

    We provide a detailed analysis of several aspects of the turduckening technique for evolving black holes. At the analytical level we study constraint propagation for a general family of BSSN-type formulations of Einstein's field equations and identify under what conditions the turducken procedure is rigorously justified and under what conditions constraint violations will propagate outside the black holes. We present high-resolution spherically symmetric studies which verify our analytical predictions. We then present three-dimensional simulations of single distorted black holes using different variations of the turduckening method as well as the puncture method. We study the effect that these different methods have on the coordinate conditions, constraint violations, and extracted gravitational waves. We find that the waves agree up to small but non-vanishing differences, caused by escaping superluminal gauge modes. These differences become smaller with increasing detector distance. (Comment: Minor changes to match the final version to appear in PR)
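    The central trick, overwriting the field data deep inside the horizon with smooth "stuffing" and relying on causality to protect the exterior, can be illustrated with a toy blending function. The sketch below is schematic and spherically symmetric; the fill radius, blend width, and interior profile are made-up parameters, not the paper's prescription:

```python
# Toy illustration of "turduckening": smoothly overwrite data deep inside the
# horizon with regular values, leaving the exterior untouched. Schematic only;
# r_fill, width, and the interior profile are assumed, not from the paper.
import numpy as np

def turducken(field, r, r_fill, width):
    """Blend `field` toward a constant interior value for r < r_fill."""
    # Transition weight: ~1 deep inside the fill region, ~0 outside it.
    w = 0.5 * (1.0 - np.tanh((r - r_fill) / width))
    fill_value = field[np.searchsorted(r, r_fill)]  # smooth interior stand-in
    return w * fill_value + (1.0 - w) * field

r = np.linspace(0.0, 10.0, 1001)
singular = 1.0 / np.maximum(r, 1e-6)   # mock field that blows up at r = 0
stuffed = turducken(singular, r, r_fill=1.0, width=0.2)
```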

    A mixed volume grid approach for the Euler and Navier-Stokes equations

    An approach for solving the compressible Euler and Navier-Stokes equations upon meshes composed of nearly arbitrary polyhedra is described. Each polyhedron is constructed from an arbitrary number of triangular and quadrilateral face elements, allowing the unified treatment of tetrahedral, prismatic, pyramidal, and hexahedral cells, as well as the general cut cells produced by Cartesian mesh approaches. The basics behind the numerical approach and the resulting data structures are described. The accuracy of the mixed volume grid approach is assessed by performing a grid refinement study upon a series of hexahedral, tetrahedral, prismatic, and Cartesian meshes for an analytic inviscid problem. A series of laminar validation cases is presented, comparing the results upon differing grid topologies to each other, to theory, and to experimental data. Finally, a computation upon a prismatic/tetrahedral mesh simulating the laminar flow over a wall/cylinder combination is presented.
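    What makes "nearly arbitrary polyhedra" tractable is a face-based data structure: the solver loops over faces, each of which knows its two adjacent cells, so hexahedra and cut cells share one code path. A minimal Python sketch of that idea, using a placeholder scalar upwind flux rather than the actual Euler fluxes:

```python
# Sketch of a face-based mixed-volume-grid data structure: the solver loops
# over faces, so all cell shapes (tets, prisms, pyramids, hexes, cut cells)
# are treated uniformly. The scalar upwind flux is a placeholder.
import numpy as np

n_cells = 4
face_left  = np.array([0, 1, 2, 3])     # cell on the left of each face
face_right = np.array([1, 2, 3, -1])    # right cell; -1 marks a boundary face
face_normal = np.array([[1.0, 0.0, 0.0]] * 4)  # area-weighted normals

def accumulate_residual(u, flux):
    """Scatter face fluxes to cell residuals; cell shape never appears."""
    res = np.zeros(n_cells)
    for f in range(face_left.size):
        l, rgt = face_left[f], face_right[f]
        ur = u[rgt] if rgt >= 0 else u[l]       # trivial boundary treatment
        fl = flux(u[l], ur, face_normal[f])
        res[l] -= fl                            # flux leaves the left cell...
        if rgt >= 0:
            res[rgt] += fl                      # ...and enters the right cell
    return res

# Upwind flux for a scalar advected with unit velocity along x.
res = accumulate_residual(np.arange(n_cells, dtype=float),
                          lambda ul, ur, n: ul * n[0])
```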

    Dynamic Resource Allocation on Virtual Machines

    Resource allocation is one of the main issues in cloud computing, since scarce resources must be distributed among competing users. Even when sufficient resources are available, they are not always used effectively, so a resource allocation method is needed to make full use of the resources at hand. With such a method, users need to install neither hardware nor software to access applications. In this paper the aim is to implement a virtual machine (VM) resource monitor on the OpenNebula platform with a web-based interface, to integrate a Dynamic Resource Allocation (DRA) method for virtual machines (useful when an overload occurs), and to present experimental results from a virtual machine before and after applying DRA.
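    A threshold-triggered DRA loop of the kind described can be sketched in a few lines. The monitoring and resizing functions below are hypothetical stubs (a real implementation would call the OpenNebula API instead), and the threshold and step size are assumed values:

```python
# Toy sketch of threshold-based dynamic resource allocation (DRA): watch each
# VM's utilization and grant more capacity when an overload is detected.
# The stubs and constants below are hypothetical, not OpenNebula API calls.
OVERLOAD_THRESHOLD = 0.85   # assumed CPU-utilization trigger
STEP = 1                    # assumed number of vCPUs added per adjustment

def get_cpu_utilization(vm_id):
    return 0.92             # stub: a real monitor would query the hypervisor

def resize_vm(vm_id, extra_vcpus):
    print(f"VM {vm_id}: +{extra_vcpus} vCPU(s)")  # stub: would request resize

def dra_step(vm_ids):
    """One pass of the monitor: resize every overloaded VM."""
    for vm in vm_ids:
        if get_cpu_utilization(vm) > OVERLOAD_THRESHOLD:
            resize_vm(vm, STEP)

dra_step([101, 102])
```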

    Generating and auto-tuning parallel stencil codes

    In this thesis, we present a software framework, Patus, which generates high-performance stencil codes for different types of hardware platforms, including current multicore CPU and graphics processing unit architectures. The ultimate goals of the framework are productivity, portability (of both the code and its performance), and achieving high performance on the target platform. A stencil computation updates every grid point in a structured grid based on the values of its neighboring points. This class of computations occurs frequently in scientific and general-purpose computing (e.g., in partial differential equation solvers or in image processing), justifying the focus on this kind of computation. The proposed key ingredients to achieve the goals of productivity, portability, and performance are domain-specific languages (DSLs) and the auto-tuning methodology. The Patus stencil specification DSL allows the programmer to express a stencil computation in a concise way, independently of hardware architecture-specific details. Thus, it increases programmer productivity by relieving him or her of low-level programming-model issues and of manually applying hardware platform-specific code optimization techniques. The use of domain-specific languages also implies code reusability: once implemented, the same stencil specification can be reused on different hardware platforms, i.e., the specification code is portable across hardware architectures. Constructing the language to be geared towards a special purpose makes it amenable to more aggressive optimizations and therefore to potentially higher performance. Auto-tuning provides performance and performance portability by automated adaptation of implementation-specific parameters to the characteristics of the hardware on which the code will run. By automating the process of parameter tuning (which essentially amounts to solving an integer programming problem whose objective function is the code's performance as a function of the parameter configuration), the system can also be used more productively than if the programmer had to fine-tune the code manually. We show performance results for a variety of stencils for which Patus was used to generate the corresponding implementations. The selection includes stencils taken from two real-world applications: a simulation of the temperature within the human body during hyperthermia cancer treatment, and a seismic application. These examples demonstrate the framework's flexibility and its ability to produce high-performance code.
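    The auto-tuning step is, at its core, a search over implementation parameters with measured runtime as the objective. Below is a minimal Python sketch of that search for a 2D 5-point Jacobi stencil, with the loop-blocking factor as the single assumed tuning parameter; Patus itself generates optimized C or CUDA code rather than NumPy:

```python
# Minimal sketch of auto-tuning: run the same 2D 5-point stencil for several
# candidate block sizes and keep the fastest. Patus generates optimized C/CUDA;
# this NumPy version only illustrates the parameter search itself.
import time
import numpy as np

def jacobi_blocked(u, block):
    out = np.empty_like(u)
    n = u.shape[0]
    for i0 in range(1, n - 1, block):            # sweep over row tiles
        i1 = min(i0 + block, n - 1)
        out[i0:i1, 1:-1] = 0.25 * (u[i0-1:i1-1, 1:-1] + u[i0+1:i1+1, 1:-1]
                                   + u[i0:i1, :-2] + u[i0:i1, 2:])
    return out                                   # boundary rows left untouched

u = np.random.rand(512, 512)
timings = {}
for block in (16, 32, 64, 128):                  # assumed candidate parameters
    t0 = time.perf_counter()
    jacobi_blocked(u, block)
    timings[block] = time.perf_counter() - t0
print("best block size:", min(timings, key=timings.get))
```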

    ICASE semiannual report, April 1 - September 30, 1989

    The Institute conducts unclassified basic research in applied mathematics, numerical analysis, and computer science in order to extend and improve problem-solving capabilities in science and engineering, particularly in aeronautics and space. The major categories of the current Institute for Computer Applications in Science and Engineering (ICASE) research program are: (1) numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; (2) control and parameter identification problems, with emphasis on effective numerical methods; (3) computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and (4) computer systems and software, especially vector and parallel computers. ICASE reports are considered to be primarily preprints of manuscripts that have been submitted to appropriate research journals or that are to appear in conference proceedings.

    Comparing Matrix-based and Matrix-free Discrete Adjoint Approaches to the Euler Equations

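    The contrast named in the title is a standard one: a matrix-based discrete adjoint assembles the residual Jacobian and transposes it, while a matrix-free adjoint only supplies Jacobian-transpose-vector products to an iterative solver. The toy illustration below is generic (a two-variable residual standing in for the Euler equations), not code from the paper:

```python
# Generic toy contrast between matrix-based and matrix-free discrete adjoints
# for J^T psi = dI/du. The two-variable residual stands in for an Euler
# residual; none of this is code from the paper.
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def residual(u):
    return np.array([u[0]**2 + u[1], 4.0 * u[0] - 3.0 * u[1]])

u = np.array([1.0, 2.0])
dI_du = np.array([1.0, 0.0])          # objective sensitivity (assumed)

# Matrix-based: assemble J column by column (finite differences), transpose, solve.
eps, n = 1e-7, u.size
J = np.column_stack([(residual(u + eps * np.eye(n)[:, j]) - residual(u)) / eps
                     for j in range(n)])
psi_mb = np.linalg.solve(J.T, dI_du)

# Matrix-free: hand only J^T v products (here analytic; a real code would use
# reverse-mode AD) to a Krylov solver, never forming J.
jtv = lambda v: np.array([2.0 * u[0] * v[0] + 4.0 * v[1], v[0] - 3.0 * v[1]])
psi_mf, _ = gmres(LinearOperator((n, n), matvec=jtv), dI_du)

print(np.allclose(psi_mb, psi_mf, atol=1e-6))   # both routes agree
```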

    Large-scale performance of a DSL-based multi-block structured-mesh application for Direct Numerical Simulation

    SBLI (Shock-wave/Boundary-layer Interaction) is a large-scale Computational Fluid Dynamics (CFD) application, developed over 20 years at the University of Southampton and extensively used within the UK Turbulence Consortium. It is capable of performing Direct Numerical Simulations (DNS) or Large Eddy Simulations (LES) of shock-wave/boundary-layer interaction problems over highly detailed multi-block structured mesh geometries. SBLI presents major challenges in data organization and movement that need to be overcome for continued high performance on emerging massively parallel hardware platforms. In this paper we present research towards achieving this goal through the OPS embedded domain-specific language. OPS targets the domain of multi-block structured mesh applications. It provides an API embedded in C/C++ and Fortran and makes use of automatic code generation and compilation to produce executables capable of running on a range of parallel hardware systems. The core functionality of SBLI is captured using a new framework called OpenSBLI, which enables a developer to declare the partial differential equations using Einstein notation and then automatically carry out discretization and generation of OPS (C/C++) API code. OPS is then used to automatically generate a wide range of parallel implementations. Using this multi-layered abstraction approach, we demonstrate how new opportunities for further optimizations, such as fine-tuning the computational intensity and reducing data movement, can be identified and applied automatically. Performance results demonstrate that there is no performance loss due to the high-level development strategy with OPS and OpenSBLI, with performance matching or exceeding that of the hand-tuned original code on all CPU nodes tested. The data movement optimizations provide over 3× speedups on CPU nodes, while GPUs provide 5× speedups over the best-performing CPU node. The OPS-generated parallel code also demonstrates excellent scalability on nearly 100K cores on a Cray XC30 (ARCHER at EPCC) and on over 4K GPUs on a Cray XK7 (Titan at ORNL).
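    The OpenSBLI idea of declaring an equation once with repeated (Einstein) indices and expanding it mechanically can be sketched with SymPy. The snippet shows only the index expansion for the continuity equation; the names and the use of SymPy are illustrative, not OpenSBLI's actual API:

```python
# Sketch of the OpenSBLI-style workflow: state an equation with a repeated
# (Einstein) index and expand it over the spatial dimensions with SymPy.
# Names are illustrative; this is not OpenSBLI's actual API.
import sympy as sp

ndim = 3
t = sp.Symbol('t')
x = sp.symbols(f'x0:{ndim}')                     # x0, x1, x2
rho = sp.Function('rho')
u = [sp.Function(f'u{i}') for i in range(ndim)]
args = (t, *x)

# Continuity equation d(rho)/dt + d(rho*u_i)/dx_i = 0: the repeated index i
# becomes an explicit sum over the spatial dimensions.
divergence = sum(sp.diff(rho(*args) * u[i](*args), x[i]) for i in range(ndim))
continuity = sp.Eq(sp.diff(rho(*args), t), -divergence)
print(continuity)
# A layer like OPS would now replace each derivative with a finite-difference
# stencil and generate parallel C/C++ loops for the target hardware.
```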