
    A benchmark study on mantle convection in a 3-D spherical shell using CitcomS

    As high-performance computing facilities and sophisticated modeling software become available, modeling mantle convection in a three-dimensional (3-D) spherical shell geometry with realistic physical parameters and processes becomes increasingly feasible. However, there is still a lack of comprehensive benchmark studies for 3-D spherical mantle convection. Here we present benchmark and test calculations using the finite element code CitcomS for 3-D spherical convection. Two classes of model calculations are presented: Stokes' flow, and thermal and thermochemical convection. For Stokes' flow, response functions of characteristic flow velocity, topography, and geoid at the surface and core-mantle boundary (CMB) at different spherical harmonic degrees are computed using CitcomS and compared with analytic solutions obtained with a propagator matrix method. For thermal and thermochemical convection, 24 cases are computed with different model parameters, including Rayleigh number (7 × 10^3 or 10^5) and viscosity contrast due to temperature dependence (1 to 10^7). For each case, time-averaged quantities at the steady state are computed, including surface and CMB Nusselt numbers, RMS velocity, averaged temperature, and maximum and minimum flow velocity and temperature at the midmantle depth, together with their standard deviations. For thermochemical convection cases, in addition to the outputs for thermal convection, we also quantify the entrainment of an initially dense component of the convection and the relative errors in conserving its volume. For nine thermal convection cases that have small viscosity variations and for which previously published results are available, we find that the CitcomS results are mostly consistent with those previously published, with less than 1% relative difference in globally averaged quantities, including Nusselt numbers and RMS velocities. For the other 15 cases, with either strongly temperature-dependent viscosity or thermochemical convection, no previous calculations are available for comparison, but these 15 test calculations from CitcomS are useful for future code developments and comparisons. We also present results on the parallel efficiency of CitcomS, showing that the code achieves 57% efficiency with 3072 cores on the Texas Advanced Computing Center's parallel supercomputer Ranger.
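The abstract quotes a parallel efficiency figure without stating how it is defined. As a minimal sketch, assuming the usual strong-scaling definition (measured speedup divided by ideal speedup relative to a baseline run), the quantity could be computed as follows. The function name and the timings are hypothetical and are not taken from the paper.

```python
def strong_scaling_efficiency(base_cores, base_time, cores, time):
    """Parallel efficiency of a run on `cores` relative to a baseline run.

    Assumes strong scaling: the same problem is solved in both runs.
    All arguments are illustrative; the paper's baseline is not given here.
    """
    if min(base_cores, cores) <= 0 or min(base_time, time) <= 0:
        raise ValueError("core counts and times must be positive")
    speedup = base_time / time          # how much faster the larger run is
    ideal_speedup = cores / base_cores  # speedup under perfect scaling
    return speedup / ideal_speedup


# Hypothetical timings: 100 s on 1 core, 40 s on 4 cores.
example_efficiency = strong_scaling_efficiency(1, 100.0, 4, 40.0)
print(f"efficiency = {example_efficiency:.3f}")
```

Under this definition an efficiency of 1.0 means perfect scaling, and a value such as the 57% reported above means the run takes longer than the ideal time by a factor of about 1/0.57.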

    Algebraic Methods in the Congested Clique

    In this work, we use algebraic methods for studying distance computation and subgraph detection tasks in the congested clique model. Specifically, we adapt parallel matrix multiplication implementations to the congested clique, obtaining an $O(n^{1-2/\omega})$ round matrix multiplication algorithm, where $\omega < 2.3728639$ is the exponent of matrix multiplication. In conjunction with known techniques from centralised algorithmics, this gives significant improvements over previous best upper bounds in the congested clique model. The highlight results include: (i) triangle and 4-cycle counting in $O(n^{0.158})$ rounds, improving upon the $O(n^{1/3})$ triangle detection algorithm of Dolev et al. [DISC 2012]; (ii) a $(1 + o(1))$-approximation of all-pairs shortest paths in $O(n^{0.158})$ rounds, improving upon the $\tilde{O}(n^{1/2})$-round $(2 + o(1))$-approximation algorithm of Nanongkai [STOC 2014]; and (iii) computing the girth in $O(n^{0.158})$ rounds, which is the first non-trivial solution in this model. In addition, we present a novel constant-round combinatorial algorithm for detecting 4-cycles. Comment: This work is a merger of arxiv:1412.2109 and arxiv:1412.266
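The link between matrix multiplication and triangle counting mentioned above rests on a standard identity: for an undirected simple graph with adjacency matrix A, the number of triangles equals trace(A^3) / 6. The sketch below illustrates that identity on a single machine with a naive cubic-time product; it is not the distributed congested clique algorithm from the paper, and the function names are hypothetical. It also evaluates the exponent 1 - 2/ω at the stated bound on ω.

```python
def mat_mul(a, b):
    """Naive product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_triangles(adj):
    """Count triangles in an undirected simple graph via trace(A^3) / 6."""
    n = len(adj)
    a2 = mat_mul(adj, adj)
    # trace(A^3) = sum over i, j of (A^2)[i][j] * A[j][i]
    closed_walks = sum(a2[i][j] * adj[j][i] for i in range(n) for j in range(n))
    # Each triangle is counted once per start vertex (3) and direction (2).
    return closed_walks // 6


def round_exponent(omega):
    """Exponent 1 - 2/omega from the round bound O(n^(1 - 2/omega))."""
    return 1 - 2 / omega


# Complete graph on 4 vertices: every choice of 3 vertices is a triangle.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(count_triangles(k4))
print(round(round_exponent(2.3728639), 4))
```

With ω = 2.3728639 the exponent comes out just below 0.158, which is consistent with the $O(n^{0.158})$ bounds quoted in the abstract.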

    Automating Fault Tolerance in High-Performance Computational Biological Jobs Using Multi-Agent Approaches

    Background: Large-scale biological jobs on high-performance computing systems require manual intervention if one or more of the computing cores on which they execute fail. This imposes not only a cost on the maintenance of the job, but also a cost on the time taken to reinstate the job, along with the risk of losing data and the execution accomplished by the job before it failed. Approaches that can proactively detect computing core failures and relocate the failing core's job onto reliable cores can make a significant step towards automating fault tolerance. Method: This paper describes an experimental investigation into the use of multi-agent approaches for fault tolerance. Two approaches are studied, the first at the job level and the second at the core level. The approaches are investigated for single core failure scenarios that can occur in the execution of parallel reduction algorithms on computer clusters. A third approach is proposed that incorporates multi-agent technology at both the job and core level. Experiments are pursued in the context of genome searching, a popular computational biology application. Result: The key conclusion is that the proposed approaches are feasible for automating fault tolerance in high-performance computing systems with minimal human intervention. In a typical experiment in which fault tolerance is studied, centralised and decentralised checkpointing approaches on average add 90% to the actual time for executing the job. In the same experiment, by contrast, the multi-agent approaches add only 10% to the overall execution time. Comment: Computers in Biology and Medicin

    A domain decomposing parallel sparse linear system solver

    The solution of large sparse linear systems is often the most time-consuming part of many science and engineering applications. Computational fluid dynamics, circuit simulation, power network analysis, and material science are just a few examples of the application areas in which large sparse linear systems need to be solved effectively. In this paper we introduce a new parallel hybrid sparse linear system solver for distributed memory architectures that contains both direct and iterative components. We show that by using our solver one can alleviate the drawbacks of direct and iterative solvers, achieving better scalability than with direct solvers and more robustness than with classical preconditioned iterative solvers. Comparisons to well-known direct and iterative solvers on a parallel architecture are provided. Comment: To appear in Journal of Computational and Applied Mathematic