1,790 research outputs found

    A Pattern Language for High-Performance Computing Resilience

    High-performance computing (HPC) systems provide powerful capabilities for modeling, simulation, and data analytics for a broad class of computational problems. They enable extreme performance on the order of quadrillions of floating-point arithmetic calculations per second by aggregating the power of millions of compute, memory, networking and storage components. With the rapidly growing scale and complexity of HPC systems for achieving even greater performance, ensuring their reliable operation in the face of system degradations and failures is a critical challenge. System fault events often cause scientific applications to produce incorrect results, or may even cause their untimely termination. The sheer number of components in modern extreme-scale HPC systems and the complex interactions and dependencies among the hardware and software components, the applications, and the physical environment make the design of practical solutions that support fault resilience a complex undertaking. To manage this complexity, we developed a methodology for designing HPC resilience solutions using design patterns. We codified the well-known techniques for handling faults, errors and failures that have been devised, applied and improved upon over the past three decades in the form of design patterns. In this paper, we present a pattern language to enable a structured approach to the development of HPC resilience solutions. The pattern language reveals the relations among the resilience patterns and provides the means to explore alternative techniques for handling a specific fault model that may have different efficiency and complexity characteristics. Using the pattern language enables the design and implementation of comprehensive resilience solutions as a set of interconnected resilience patterns that can be instantiated across layers of the system stack.
    Comment: Proceedings of the 22nd European Conference on Pattern Languages of Programs

    Experiments in fault tolerant software reliability

    The reliability of voting was evaluated in a fault-tolerant software system for small output spaces. The effectiveness of the back-to-back testing process was investigated. Version 3.0 of the RSDIMU-ATS, a semi-automated test bed for certification testing of RSDIMU software, was prepared and distributed. Software reliability estimation methods based on non-random sampling are being studied. The investigation of existing fault-tolerance models was continued and formulation of new models was initiated.

    Realtime reservoir characterization and beyond: cyber-infrastructure tools and technologies

    The advent of the digital oil field and rapidly decreasing cost of computing creates opportunities as well as challenges in simulation based reservoir studies, in particular, real-time reservoir characterization and optimization. One challenge our efforts are directed toward is the use of real-time production data to perform live reservoir characterization using high throughput, high performance computing environments. To that end we developed the required tools of parallel reservoir simulator, parallel ensemble Kalman filter and a scalable workflow manager. When using this collection of tools, a reservoir modeler is able to perform large scale reservoir management studies in short periods of time. This includes studies with thousands of models that are individually complex and large, involving millions of degrees of freedom. Using parallel processing, we are able to solve these models much faster than we otherwise would on a single, serial machine. This motivated the development of a fast parallel reservoir simulator. Furthermore, distributing those simulations across resources leads to a smaller total time to completion by making use of distributed processing. This allows the development of a scalable high throughput workflow manager. Finally, with thousands of models, each with millions of degrees of freedom, we end up with a superfluity of model parameters. This translates directly to billions of degrees of freedom in the reservoir study. To be able to use the ensemble Kalman filter on these models, we needed to develop a parallel implementation of the ensemble Kalman filter. This thesis discusses the enabling tools and technologies developed to address a specific problem: how to accurately characterize reservoirs, using large numbers of complex detailed models. For these characterization studies to be helpful in making production decisions, the time to solution must be feasible. To that end, our work is focused on developing and extending these tools, and optimizing their performance.

    Machine Learning for High-entropy Alloys: Progress, Challenges and Opportunities

    High-entropy alloys (HEAs) have attracted extensive interest due to their exceptional mechanical properties and the vast compositional space for new HEAs. However, understanding their novel physical mechanisms and then using these mechanisms to design new HEAs are confronted with their high-dimensional chemical complexity, which presents unique challenges to (i) the theoretical modeling that needs accurate atomic interactions for atomistic simulations and (ii) constructing reliable macro-scale models for high-throughput screening of vast amounts of candidate alloys. Machine learning (ML) sheds light on these problems with its capability to represent extremely complex relations. This review highlights the success and promising future of utilizing ML to overcome these challenges. We first introduce the basics of ML algorithms and application scenarios. We then summarize the state-of-the-art ML models describing atomic interactions and atomistic simulations of thermodynamic and mechanical properties. Special attention is paid to phase predictions, planar-defect calculations, and plastic deformation simulations. Next, we review ML models for macro-scale properties, such as lattice structures, phase formations, and mechanical properties. Examples of machine-learned phase-formation rules and order parameters are used to illustrate the workflow. Finally, we discuss the remaining challenges and present an outlook of research directions, including uncertainty quantification and ML-guided inverse materials design.
    Comment: This review paper has been accepted by Progress in Materials Science

    Unlocking ultrastrong high-temperature ceramics: Beyond Equimolar Compositions in High Entropy Nitrides

    Traditionally, increasing compositional complexity and chemical diversity of high entropy alloy ceramics whilst maintaining a stable single-phase solid solution has been a primary design strategy for the development of new ceramics. However, only a handful have shown properties that justify the increased alloying content. Here, we unveil a groundbreaking strategy based on deviation from conventional equimolar composition towards non-equimolar composition space, enabling tuning the metastability level of the supersaturated single-phase solid solution. By employing high-temperature micromechanical testing of refractory metal-based high entropy nitrides, we found that the activation of an additional strengthening mechanism upon metastable phase decomposition propels the yield strength of a non-equimolar nitride at 1000 °C to a staggering 6.9 GPa, which is 30% higher than the most robust equimolar nitride. We show that the inherent instability triggers the decomposition of the solid solution with non-equimolar composition at high temperatures, inducing strengthening due to the coherency stress of a spinodally modulated structure, combined with the lattice resistance of the product solid solution phase. In stark contrast, the strength of equimolar systems, boasting diverse chemical compositions, declines as a function of temperature due to the weakening of the lattice resistance and the absence of other strengthening mechanisms.
    Comment: 17 pages, 4 figures, 25 supplementary pages, 19 supplementary figures, 1 Supplementary Table

    High-Entropy Alloys for Advanced Nuclear Applications

    The expanded compositional freedom afforded by high-entropy alloys (HEAs) represents a unique opportunity for the design of alloys for advanced nuclear applications, in particular for applications where current engineering alloys fall short. This review assesses the work done to date in the field of HEAs for nuclear applications, provides critical insight into the conclusions drawn, and highlights possibilities and challenges for future study. It is found that our understanding of the irradiation responses of HEAs remains in its infancy, and much work is needed in order for our knowledge of any single HEA system to match our understanding of conventional alloys such as austenitic steels. A number of studies have suggested that HEAs possess ‘special’ irradiation damage resistance, although some of the proposed mechanisms, such as those based on sluggish diffusion and lattice distortion, remain somewhat unconvincing (certainly in terms of being universally applicable to all HEAs). Nevertheless, there may be some mechanisms and effects that are uniquely different in HEAs when compared to more conventional alloys, such as the effect that their poor thermal conductivities have on the displacement cascade. Furthermore, the opportunity to tune the compositions of HEAs over a large range to optimise particular irradiation responses could be very powerful, even if the design process remains challenging.