685 research outputs found

    Computer architecture evaluation for structural dynamics computations: Project summary

    Get PDF
    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies

    Single-Producer/Single-Consumer Queues on Shared Cache Multi-Core Systems

    Full text link
    Using efficient point-to-point communication channels is critical for implementing fine grained parallel program on modern shared cache multi-core architectures. This report discusses in detail several implementations of wait-free Single-Producer/Single-Consumer queue (SPSC), and presents a novel and efficient algorithm for the implementation of an unbounded wait-free SPSC queue (uSPSC). The correctness proof of the new algorithm, and several performance measurements based on simple synthetic benchmark and microbenchmark, are also discussed

    An architecture for intelligent task interruption

    Get PDF
    In the design of real time systems the capability for task interruption is often considered essential. The problem of task interruption in knowledge-based domains is examined. It is proposed that task interruption can be often avoided by using appropriate functional architectures and knowledge engineering principles. Situations for which task interruption is indispensable, a preliminary architecture based on priority hierarchies is described

    Sentinels: A concept for multiprocess coordination

    Get PDF
    Journal ArticleThe sentinel construct is introduced, which provides a certain syntactic and semantic framework for multiprocess coordination. The advantage of this construct over others is argued to be semantic transparency, efficiency, ease in implementation, and usefulness in verfication

    Virtualizing the Stampede2 Supercomputer with Applications to HPC in the Cloud

    Full text link
    Methods developed at the Texas Advanced Computing Center (TACC) are described and demonstrated for automating the construction of an elastic, virtual cluster emulating the Stampede2 high performance computing (HPC) system. The cluster can be built and/or scaled in a matter of minutes on the Jetstream self-service cloud system and shares many properties of the original Stampede2, including: i) common identity management, ii) access to the same file systems, iii) equivalent software application stack and module system, iv) similar job scheduling interface via Slurm. We measure time-to-solution for a number of common scientific applications on our virtual cluster against equivalent runs on Stampede2 and develop an application profile where performance is similar or otherwise acceptable. For such applications, the virtual cluster provides an effective form of "cloud bursting" with the potential to significantly improve overall turnaround time, particularly when Stampede2 is experiencing long queue wait times. In addition, the virtual cluster can be used for test and debug without directly impacting Stampede2. We conclude with a discussion of how science gateways can leverage the TACC Jobs API web service to incorporate this cloud bursting technique transparently to the end user.Comment: 6 pages, 0 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, US

    Parallel Astronomical Data Processing with Python: Recipes for multicore machines

    Full text link
    High performance computing has been used in various fields of astrophysical research. But most of it is implemented on massively parallel systems (supercomputers) or graphical processing unit clusters. With the advent of multicore processors in the last decade, many serial software codes have been re-implemented in parallel mode to utilize the full potential of these processors. In this paper, we propose parallel processing recipes for multicore machines for astronomical data processing. The target audience are astronomers who are using Python as their preferred scripting language and who may be using PyRAF/IRAF for data processing. Three problems of varied complexity were benchmarked on three different types of multicore processors to demonstrate the benefits, in terms of execution time, of parallelizing data processing tasks. The native multiprocessing module available in Python makes it a relatively trivial task to implement the parallel code. We have also compared the three multiprocessing approaches - Pool/Map, Process/Queue, and Parallel Python. Our test codes are freely available and can be downloaded from our website.Comment: 15 pages, 7 figures, 1 table, "for associated test code, see http://astro.nuigalway.ie/staff/navtejs", Accepted for publication in Astronomy and Computin
    • …
    corecore