488 research outputs found

    A METHOD FOR COMPUTING RADIAL NETWORK SYSTEM RELIABILITY


    RELIABILITY INVESTIGATIONS OF ELECTRIC DISTRIBUTION NETWORKS


    Comparative evaluation of bandwidth-bound applications on the Intel Xeon CPU MAX Series

    In this paper we explore the performance of the Intel Xeon CPU MAX Series, which represents the most significant new variation upon the classical CPU architecture since the Intel Xeon Phi Processor. Given the availability of a large on-package high-bandwidth memory, the bandwidth-to-compute ratio has shifted significantly compared to other CPUs on the market. Since a large fraction of HPC workloads are sensitive to the available bandwidth, we explore how this architecture performs on a selection of bandwidth-sensitive HPC proxies and applications, and how it compares to the previous 3rd generation Intel Xeon Scalable processors (codenamed Ice Lake) and an AMD EPYC 7003 Series Processor with 3D V-Cache Technology (codenamed Milan-X). We explore performance with different parallel implementations (MPI, MPI+OpenMP, MPI+SYCL), compiled with different compilers and flags, and executed with or without hyperthreading. We show how performance bottlenecks shift from bandwidth to communication latencies for some applications, and demonstrate speedups of between 2.0x and 4.3x over the previous generation.

    Acceleration of a Full-scale Industrial CFD Application with OP2


    Automatic parallel implementations of adjoint codes for structured mesh applications

    Algorithmic Differentiation (AD) has been shown to be an essential tool for obtaining sensitivity information in multiple areas of science, such as Computational Fluid Dynamics (CFD) applications or finance. Yet there is no adequate tool to ease the cost of providing performance-portable AD codes, especially for modern hardware like GPU clusters. This paper sketches our plans and progress so far in extending the OPS framework with an adjoint tape (storage for descriptors of intermediate steps and intermediate states of variables) and shows preliminary performance results on CPU nodes. OPS (Oxford Parallel library for Structured mesh solvers) has shown good performance and scaling on a wide range of HPC architectures. Our work aims to exploit the benefits of OPS to provide performance-portable adjoint implementations for future structured mesh stencil applications using OPS with minimal modifications.
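The idea of an adjoint tape mentioned in the abstract above can be illustrated with a minimal sketch. This is not the OPS API or the authors' implementation; the `Tape` class and its method names are hypothetical and only show the general technique of recording each intermediate step and its local partial derivatives, then replaying the records in reverse to accumulate adjoints.

```python
class Tape:
    """Hypothetical scalar tape: stores intermediate values and local partials."""

    def __init__(self):
        self.values = []  # intermediate state of every variable
        self.ops = []     # descriptors: (output index, [(input index, partial)])

    def variable(self, value):
        # Register an input (or intermediate) value and return its index.
        self.values.append(value)
        return len(self.values) - 1

    def _record(self, value, partials):
        # Store the result of one step together with its local derivatives.
        out = self.variable(value)
        self.ops.append((out, partials))
        return out

    def add(self, a, b):
        return self._record(self.values[a] + self.values[b], [(a, 1.0), (b, 1.0)])

    def mul(self, a, b):
        va, vb = self.values[a], self.values[b]
        return self._record(va * vb, [(a, vb), (b, va)])

    def gradient(self, out):
        # Reverse sweep: walk the tape backwards, propagating adjoints.
        adjoint = [0.0] * len(self.values)
        adjoint[out] = 1.0
        for idx, partials in reversed(self.ops):
            for inp, partial in partials:
                adjoint[inp] += partial * adjoint[idx]
        return adjoint


# f(x, y) = x * y + x at x = 3, y = 4
tape = Tape()
x = tape.variable(3.0)
y = tape.variable(4.0)
f = tape.add(tape.mul(x, y), x)
grad = tape.gradient(f)
print(tape.values[f], grad[x], grad[y])
```

Here df/dx = y + 1 and df/dy = x, so the reverse sweep yields 5.0 and 3.0 for the two inputs. A production framework would record whole parallel loops over mesh data rather than scalar operations.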

    Bitwise Reproducible task execution on unstructured mesh applications

    Many mesh applications use floating-point arithmetic, which does not necessarily obey the associative law of algebra. This can make an application's results unreproducible. In this paper we present a method for unstructured mesh applications that provides bitwise reproducibility between separate runs, even if they are started with different numbers of MPI processes. We implement our work in the OP2 domain-specific library, which provides an API that abstracts the solution of unstructured mesh computations. We carry out a performance analysis of our method applied to two applications: a simple airfoil application, and a more complex Aero application which uses a finite element method and a conjugate-gradient algorithm. We show a 2.37× to 1.49× slowdown on these applications as the price of full bitwise reproducibility.
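The non-associativity described in the abstract above can be seen in a small sketch. It does not reproduce the OP2 method; `naive_sum` and `partitioned_sum` are hypothetical helpers that mimic a reduction split across a varying number of processes, and `math.fsum` (an exactly rounded sum from the Python standard library) stands in for an order-independent reduction.

```python
import math


def naive_sum(xs):
    # Plain left-to-right accumulation in double precision.
    total = 0.0
    for x in xs:
        total += x
    return total


def partitioned_sum(xs, n_parts):
    # Mimic a parallel reduction: each "process" sums a contiguous chunk,
    # then the partial results are summed in rank order.
    size = -(-len(xs) // n_parts)  # ceiling division
    partials = [naive_sum(xs[i:i + size]) for i in range(0, len(xs), size)]
    return naive_sum(partials)


values = [1e16, 1.0, -1e16, 1.0]

one_process = partitioned_sum(values, 1)
two_processes = partitioned_sum(values, 2)
exact = math.fsum(values)

print(one_process, two_processes, exact)
```

With one chunk the naive result is 1.0, with two chunks it is 0.0, and the exactly rounded sum is 2.0 for any ordering of the inputs. The same data therefore gives different bits depending only on how the work is partitioned, which is the effect a reproducible reduction has to remove.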

    Osztják hősénekek (Ostyak Heroic Songs)


    Detection and characterization of DNA damage

    The DNA molecule is constantly subjected to endogenous and exogenous sources of damage which, if left unrepaired, can lead to genotoxic and cytotoxic outcomes. These lesions have been implicated in the development of numerous diseases, carcinogenesis, and aging. This research focuses on the formation of such lesions and, unlike current research, demonstrates the potential of combining multiple analytical techniques to characterize a potential damage detection system.