101 research outputs found

    On A Simpler and Faster Derivation of Single Use Reliability Mean and Variance for Model-Based Statistical Testing

    Markov chain usage-based statistical testing has proved sound and effective in providing audit trails of evidence in certifying software-intensive systems. The system end-to-end reliability is derived analytically in closed form, following an arc-based Bayesian model. System reliability is represented by an important statistic called single use reliability, defined as the probability of a randomly selected use being successful. This paper continues our earlier work on a simpler and faster derivation of the single use reliability mean, and proposes a new derivation of the single use reliability variance by applying a well-known theorem and eliminating the need to compute the second moments of arc failure probabilities. Our new results complete a new analysis that can be shown to be simpler, faster, and more direct while also rendering a more intuitive explanation. Our new theory is illustrated with three simple Markov chain usage models with manual derivations and experimental results.

    Static and Dynamic Scheduling for Effective Use of Multicore Systems

    Multicore systems have increasingly gained importance in high performance computers. Compared to the traditional microarchitectures, multicore architectures have a simpler design, higher performance-to-area ratio, and improved power efficiency. Although the multicore architecture has various advantages, traditional parallel programming techniques do not apply to the new architecture efficiently. This dissertation addresses how to determine optimized thread schedules to improve data reuse on shared-memory multicore systems and how to seek a scalable solution to designing parallel software on both shared-memory and distributed-memory multicore systems. We propose an analytical cache model to predict the number of cache misses on the time-sharing L2 cache on a multicore processor. The model provides an insight into the impact of cache sharing and cache contention between threads. Inspired by the model, we build the framework of affinity based thread scheduling to determine optimized thread schedules to improve data reuse on all the levels in a complex memory hierarchy. The affinity based thread scheduling framework includes a model to estimate the cost of a thread schedule, which consists of three submodels: an affinity graph submodel, a memory hierarchy submodel, and a cost submodel. Based on the model, we design a hierarchical graph partitioning algorithm to determine near-optimal solutions. We have also extended the algorithm to support threads with data dependences. The algorithms are implemented and incorporated into a feedback directed optimization prototype system. The prototype system builds upon a binary instrumentation tool and can improve program performance greatly on shared-memory multicore architectures. We also study the dynamic data-availability driven scheduling approach to designing new parallel software on distributed-memory multicore architectures. We have implemented a decentralized dynamic runtime system. 
The design of the runtime system focuses on scalability. At any time, only a small portion of a task graph exists in memory. We propose an algorithm to resolve data dependences in a distributed manner without process cooperation. Our experimental results demonstrate the scalability and practicality of the approach for both shared-memory and distributed-memory multicore systems. Finally, we present a scalable nonblocking topology-aware multicast scheme for distributed DAG scheduling applications.

    Correcting soft errors online in fast Fourier transform

    While many algorithm-based fault tolerance (ABFT) schemes have been proposed to detect soft errors offline in the fast Fourier transform (FFT) after computation finishes, none of the existing ABFT schemes detect soft errors online before the computation finishes. This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner. We also extend our scheme to tolerate both arithmetic errors and memory errors, develop strategies to reduce its fault tolerance overhead and improve its numerical stability and fault coverage, and finally incorporate it into the widely used FFTW library, one of today's fastest FFT software implementations. Experimental results demonstrate that: (1) the proposed online ABFT scheme introduces much lower overhead than the existing offline ABFT schemes; (2) it detects errors in a much more timely manner; and (3) it also has higher numerical stability and better fault coverage.

    Building a scientific workflow framework to enable real‐time machine learning and visualization

    We have entered the era of big data. In high performance computing, large‐scale simulations can generate huge amounts of data with potentially critical information. However, these data are usually saved in intermediate files and are not visible until advanced data analytics techniques are applied after reading all simulation data from persistent storage (e.g., local disks or a parallel file system). This approach leaves users waiting a long time for simulations to run without knowing the status of the running job. In this paper, we build a new computational framework to couple scientific simulations with multi‐step machine learning processes and in‐situ data visualizations. We also design a new scalable simulation‐time clustering algorithm to automatically detect fluid flow anomalies. This computational framework is built upon different software components and provides plug‐in data analysis and visualization functions over complex scientific workflows. With this framework, users can monitor and get real‐time notifications of special patterns or anomalies from ongoing extreme‐scale turbulent flow simulations.

    A Real-Time Machine Learning and Visualization Framework for Scientific Workflows

    High-performance computing resources are now widely used in science and engineering. Typical post-hoc approaches save the data produced by a simulation to persistent storage, so data analysis tasks must read it back from storage into memory. For large-scale scientific simulations, such I/O operations incur significant overhead. In-situ/in-transit approaches bypass I/O by accessing and processing in-memory simulation results directly, which suggests that simulations and analysis applications should be more closely coupled. This paper constructs a flexible and extensible framework to connect scientific simulations with multi-step machine learning processes and in-situ visualization tools, thus providing plug-in analysis and visualization functionality over complex workflows in real time. A distributed simulation-time clustering method is proposed to detect anomalies in real turbulent flows.

    A Simpler and More Direct Derivation of System Reliability Using Markov Chain Usage Models

    Markov chain usage-based statistical testing has been around for more than two decades, and has proved sound and effective in providing audit trails of evidence in certifying software-intensive systems. The system end-to-end reliability is derived analytically in closed form, following an arc-based Bayesian model. System reliability is represented by an important statistic called single use reliability, defined as the probability of a randomly selected use being successful. This paper reviews the analytical derivation of the single use reliability mean, and proposes a simpler, faster, and more direct way to compute the expected value that renders an intuitive explanation. The new derivation is illustrated with two examples.

    XScan: An Integrated Tool for Understanding Open Source Community-Based Scientific Code

    Many scientific communities have adopted community-based models that integrate multiple components to simulate whole system dynamics. The complexity of community software projects, which stems from the integration of multiple individual software components developed under different application requirements and for various machine architectures, has become a challenge for effective software system understanding and continuous software development. The paper presents an integrated software toolkit called X-ray Software Scanner (XScan) for a better understanding of large-scale community-based scientific codes. Our software tool quickly summarizes the overall information of scientific codes, including the number of lines of code, programming languages, external library dependencies, and architecture-dependent parallel software features. The XScan toolkit also includes a static software analysis component that collects detailed structural information and provides interactive visualization and analysis of the functions. We use a large-scale community-based Earth System Model to demonstrate the workflow, functions, and visualization of the toolkit. We also discuss the application of advanced graph analytics techniques to assist software modular design and component refactoring.

    Interactive 3D simulation for fluid–structure interactions using dual coupled GPUs

    This work integrates high-speed parallel computation with interactive 3D visualization of the lattice-Boltzmann-based immersed boundary method for fluid–structure interaction (FSI). An NVIDIA Tesla K40c is used for the computations, while an NVIDIA Quadro K5000 is used for 3D vector field visualization. The simulation can be paused at any time step so that the vector field can be explored. The density and placement of streamlines and glyphs are adjustable by the user, while panning and zooming are controlled by the mouse. The simulation can then be resumed. Unlike most scientific applications in computational fluid dynamics, where visualization is performed after the computations, our software allows for real-time visualization of the flow fields while the computations take place. To the best of our knowledge, no such GPU-based tool for FSI exists. Our software can facilitate debugging, enable observation of detailed local fields of flow and deformation while computing, and expedite identification of ‘correct’ parameter combinations in parametric studies of new phenomena. Therefore, our software is expected to shorten the ‘time to solution’ and expedite scientific discovery via scientific computing.