5,162 research outputs found

    Open-architecture Implementation of Fragment Molecular Orbital Method for Peta-scale Computing

    We present our perspective and goals on high-performance computing for nanoscience in accordance with the global trend toward "peta-scale computing." After reviewing our results obtained through the grid-enabled version of the fragment molecular orbital method (FMO) on the grid testbed of the Japanese Grid Project, National Research Grid Initiative (NAREGI), we show that FMO is one of the best candidates for peta-scale applications by predicting its effective performance in peta-scale computers. Finally, we introduce our new project constructing a peta-scale application in an open-architecture implementation of FMO in order to realize both goals of high performance in peta-scale computers and extendibility to multiphysics simulations. Comment: 6 pages, 9 figures, proceedings of the 2nd IEEE/ACM international workshop on high performance computing for nano-science and technology (HPCNano06)

    Advancing nanoelectronic device modeling through peta-scale computing and deployment on nanoHUB

    Recent improvements to existing HPC codes NEMO 3-D and OMEN, combined with access to peta-scale computing resources, have enabled realistic device engineering simulations that were previously infeasible. NEMO 3-D can now simulate 1 billion atom systems and, using 3D spatial decomposition, scale to 32,768 cores. Simulation time for the band structure of an experimental P-doped Si quantum computing device fell from 40 minutes to 1 minute. OMEN can perform fully quantum mechanical transport calculations for real-world UTB FETs on 147,456 cores in roughly 5 minutes. Both of these tools power simulation engines on the nanoHUB, giving the community access to previously unavailable research capabilities.

    Energy Wall for Exascale Supercomputing

    "Sustainable development" is one of the major issues of the 21st century, and notions such as green computing and green development have emerged one after another. As large-scale parallel computing systems develop rapidly, their energy consumption is becoming very large, especially as system performance reaches Petascale (10^15 Flops) or even Exascale (10^18 Flops). This energy consumption raises the system temperature, which seriously undermines stability and reliability and limits the growth of system size. The effect of energy consumption on scalability is therefore a growing concern. Against this background, this paper proposes the concept of the "Energy Wall" to highlight the significance of achieving scalable performance in peta/exascale supercomputing by taking energy consumption into account. We quantify the effect of energy consumption on scalability by building an energy-efficiency speedup model, which integrates computing performance and system energy. We define the energy wall quantitatively, provide a theorem on its existence, and categorize large-scale parallel computers according to their energy consumption. In the context of several representative types of HPC applications, we analyze and extrapolate the existence of the energy wall for three kinds of topologies (3D-Torus, binary n-cube and Fat tree), which provides insights on how to mitigate the energy wall effect in system design and through hardware/software optimization in peta/exascale supercomputing.

    PetaFlow: a global computing-networking-visualisation unit with social impact

    The PetaFlow application aims to contribute to the use of high performance computational resources for the benefit of society. To this end, the emergence of adequate information and communication technologies with respect to high performance computing-networking-visualisation and their mutual awareness is required. The developed technology and algorithms are presented and applied to a real global peta-scale data intensive scientific problem with social and medical importance, i.e. human upper airflow modelling.

    Reliable High Performance Peta- and Exa-Scale Computing

    As supercomputers become larger and more powerful, they are growing increasingly complex. This is reflected both in the exponentially increasing numbers of components in HPC systems (LLNL is currently installing the 1.6 million core Sequoia system) and in the wide variety of software and hardware components that a typical system includes. At this scale it becomes infeasible to make each component sufficiently reliable to prevent regular faults somewhere in the system or to account for all possible cross-component interactions. The resulting faults and instability cause HPC applications to crash, perform sub-optimally or even produce erroneous results. As supercomputers continue to approach Exascale performance and full system reliability becomes prohibitively expensive, we will require novel techniques to bridge the gap between the lower reliability provided by hardware systems and users' unchanging need for consistent performance and reliable results. Previous research on HPC system reliability has developed various techniques for tolerating and detecting various types of faults. However, these techniques have seen very limited real applicability because of our poor understanding of how real systems are affected by complex faults such as soft fault-induced bit flips or performance degradations. Prior work on such techniques has had very limited practical utility because it has generally focused on analyzing the behavior of entire software/hardware systems both during normal operation and in the face of faults. Because such behaviors are extremely complex, such studies have only produced coarse behavioral models of limited sets of software/hardware system stacks. Since this provides little insight into the many different system stacks and applications used in practice, this work has had little real-world impact.
My project addresses this problem by developing a modular methodology to analyze the behavior of applications and systems during both normal and faulty operation. By synthesizing models of individual components into whole-system behavior models, my work is making it possible to automatically understand the behavior of arbitrary real-world systems and so enable them to tolerate a wide range of system faults. My project is following a multi-pronged research strategy. Section II discusses my work on modeling the behavior of existing applications and systems. Section II.A discusses resilience in the face of soft faults and Section II.B looks at techniques to tolerate performance faults. Finally, Section III presents an alternative approach that studies how a system should be designed from the ground up to make resilience natural and easy.

    Multi-physics Extension of OpenFMO Framework

    The OpenFMO framework, an open-source software (OSS) platform for the Fragment Molecular Orbital (FMO) method, is extended to multi-physics simulations (MPS). After reviewing several FMO implementations on distributed computer environments, the subsequent development planning corresponding to MPS is presented. We discuss which form scientific software should take: lightweight and reconfigurable, or large and self-contained. Comment: 4 pages with 11 figure files, to appear in the Proceedings of ICCMSE 200

    Adaptive control in rollforward recovery for extreme scale multigrid

    With the increasing number of compute components, failures in future exa-scale computer systems are expected to become more frequent. This motivates the study of novel resilience techniques. Here, we extend a recently proposed algorithm-based recovery method for multigrid iterations by introducing an adaptive control. After a fault, the healthy part of the system continues the iterative solution process, while the solution in the faulty domain is re-constructed by an asynchronous on-line recovery. The computations in both the faulty and healthy subdomains must be coordinated in a sensitive way; in particular, both under- and over-solving must be avoided. Both of these waste computational resources and will therefore increase the overall time-to-solution. To control the local recovery and guarantee an optimal re-coupling, we introduce a stopping criterion based on a mathematical error estimator. It involves hierarchical weighted sums of residuals within the context of uniformly refined meshes and is well-suited in the context of parallel high-performance computing. The re-coupling process is steered by local contributions of the error estimator. We propose and compare two criteria which differ in their weights. Failure scenarios when solving up to 6.9 · 10^11 unknowns on more than 245,766 parallel processes will be reported on a state-of-the-art peta-scale supercomputer, demonstrating the robustness of the method.

    An Extensible Timing Infrastructure for Adaptive Large-scale Applications

    Real-time access to accurate and reliable timing information is necessary to profile scientific applications, and crucial as simulations become increasingly complex, adaptive, and large-scale. The Cactus Framework provides flexible and extensible capabilities for timing information through a well-designed infrastructure and timing API. Applications built with Cactus automatically gain access to built-in timers, such as gettimeofday and getrusage, system-specific hardware clocks, and high-level interfaces such as PAPI. We describe the Cactus timer interface, its motivation, and its implementation. We then demonstrate how this timing information can be used by an example scientific application to profile itself, and to dynamically adapt itself to a changing environment at run time.