
    A new tool for the performance analysis of massively parallel computer systems

    We present a new tool, GPA, that can generate key performance measures for very large systems. Based on solving systems of ordinary differential equations (ODEs), this method of performance analysis is far more scalable than stochastic simulation. The GPA tool is the first to produce higher-moment analysis from the differential-equation approximation, which is essential in many cases to obtain an accurate performance prediction. We identify so-called switch points as the source of error in the ODE approximation. We investigate the switch-point behaviour in several large models and observe that, in general, the accuracy of the ODE performance prediction improves as the scale of the model increases. In the case of the variance measure, we justify theoretically that, in the limit of model scale, the ODE approximation can be expected to tend to the actual variance of the model.
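    To make the ODE idea above concrete, the following is a minimal sketch (not the GPA tool itself) of a mean-field ODE approximation for a hypothetical client/server population model; all names, rates, and population sizes are illustrative assumptions. The min() term is the kind of switch point the abstract identifies as the source of approximation error.

        # Minimal sketch of a mean-field ODE approximation (illustrative, not the GPA tool).
        # A population of clients cycles between "thinking" and "waiting"; a fixed pool of
        # servers serves waiting clients, so the service rate has a min() switch point.
        from scipy.integrate import solve_ivp

        N_CLIENTS = 1000      # hypothetical population size
        N_SERVERS = 50
        THINK_RATE = 1.0      # rate at which a thinking client issues a request
        SERVICE_RATE = 5.0    # per-server service rate

        def drift(t, y):
            """Right-hand side for the mean numbers of thinking and waiting clients."""
            thinking, waiting = y
            requests = THINK_RATE * thinking                 # clients moving to waiting
            served = SERVICE_RATE * min(waiting, N_SERVERS)  # the min() is the switch point
            return [served - requests, requests - served]

        sol = solve_ivp(drift, (0.0, 10.0), [N_CLIENTS, 0.0])
        print("mean (thinking, waiting) at t=10:", sol.y[:, -1])

    The number of ODEs here is fixed by the number of local states, not by the population size, which is what makes this style of analysis scale where simulation does not.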

    Patterns of Scalable Bayesian Inference

    Datasets are growing not just in size but in complexity, creating a demand for rich models and quantification of uncertainty. Bayesian methods are an excellent fit for this demand, but scaling Bayesian inference is a challenge. In response to this challenge, there has been considerable recent work based on varying assumptions about model structure, underlying computational resources, and the importance of asymptotic correctness. As a result, there is a zoo of ideas with few clear overarching principles. In this paper, we seek to identify unifying principles, patterns, and intuitions for scaling Bayesian inference. We review existing work on utilizing modern computing resources with both MCMC and variational approximation techniques. From this taxonomy of ideas, we characterize the general principles that have proven successful for designing scalable inference procedures and comment on the path forward.
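    As a concrete instance of one pattern such a survey covers, below is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), a subsampling-based MCMC method, applied to a toy Gaussian-mean model; the dataset, step size, and batch size are assumptions chosen purely for illustration.

        # Minimal SGLD sketch: each step estimates the log-posterior gradient from a
        # random minibatch and injects Gaussian noise scaled to the step size.
        import numpy as np

        rng = np.random.default_rng(0)
        N = 100_000                                       # full (toy) dataset size
        data = rng.normal(loc=2.0, scale=1.0, size=N)     # true mean is 2.0

        BATCH = 256
        STEP = 1e-6                                       # fixed step size (would normally decay)
        PRIOR_VAR = 10.0                                  # N(0, PRIOR_VAR) prior on the mean

        theta, samples = 0.0, []
        for _ in range(5000):
            batch = rng.choice(data, size=BATCH, replace=False)
            grad_log_prior = -theta / PRIOR_VAR
            grad_log_lik = (N / BATCH) * np.sum(batch - theta)   # unit observation variance
            theta += 0.5 * STEP * (grad_log_prior + grad_log_lik) + np.sqrt(STEP) * rng.normal()
            samples.append(theta)

        print("posterior mean estimate:", np.mean(samples[1000:]))

    Only a minibatch is touched per step, so the per-iteration cost is independent of the full dataset size, at the price of extra gradient noise and asymptotic bias from the fixed step size.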

    Scalable Performance Analysis of Massively Parallel Stochastic Systems

    The accurate performance analysis of large-scale computer and communication systems is directly inhibited by an exponential growth in the state space of the underlying Markovian performance model. This is particularly true when considering massively parallel architectures such as cloud or grid computing infrastructures. Nevertheless, an ability to extract quantitative performance measures such as passage-time distributions from performance models of these systems is critical for providers of these services. Indeed, without such an ability, they remain unable to offer realistic end-to-end service level agreements (SLAs) which they can have any confidence of honouring. Additionally, this analysis must be fast enough to allow many different parameter combinations in a complex system to be tested. If we can achieve this rapid performance analysis goal, it will enable service providers and engineers to determine the cost-optimal behaviour which satisfies the SLAs.

    In this thesis, we develop a scalable performance analysis framework for the grouped PEPA stochastic process algebra. Our approach is based on the approximation of key model quantities, such as means and variances, by tractable systems of ordinary differential equations (ODEs). Crucially, the size of these systems of ODEs is independent of the number of interacting entities within the model, making these analysis techniques extremely scalable. The reliability of our approach is directly supported by convergence results and, in some cases, explicit error bounds. We focus on extracting passage-time measures from performance models since these are the terms in which service level agreements are most commonly phrased. We design scalable analysis techniques which can handle passages defined both in terms of entire component populations and in terms of individual, tagged members of a large population.

    A precise and straightforward specification of a passage-time service level agreement is as important to the performance engineering process as its evaluation. This is especially true of large and complex models of industrial-scale systems. To address this, we introduce the unified stochastic probe framework. Unified stochastic probes are used to generate a model augmentation which exposes explicitly the SLA measure of interest to the analysis toolkit. In this thesis, we deploy these probes to define many detailed and derived performance measures that can be automatically and directly analysed using rapid ODE techniques. In this way, we tackle problems at many levels of the performance engineering process: from specification and model representation to efficient and scalable analysis.
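    A minimal sketch of the fluid passage-time idea described above (not the thesis toolchain or grouped PEPA itself): integrate the mean population counts with an ODE solver and read off the first time the completed-job count crosses a target level. The model, rates, and 90% threshold are illustrative assumptions.

        # Minimal sketch of a fluid (ODE) passage-time calculation: integrate the mean
        # population counts and report the first time the completed count crosses a target.
        from scipy.integrate import solve_ivp

        JOBS = 500
        SERVERS = 20
        SERVICE_RATE = 2.0

        def drift(t, y):
            queued, completed = y
            rate = SERVICE_RATE * min(queued, SERVERS)    # service limited by the server pool
            return [-rate, rate]

        def crossed(t, y):
            return y[1] - 0.9 * JOBS                      # event: 90% of jobs completed
        crossed.terminal = True
        crossed.direction = 1

        sol = solve_ivp(drift, (0.0, 100.0), [JOBS, 0.0], events=crossed, max_step=0.05)
        print("approximate 90% passage time:", sol.t_events[0][0])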

    Research in progress in applied mathematics, numerical analysis, fluid mechanics, and computer science

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period October 1, 1993 through March 31, 1994. The major categories of the current ICASE research program are: (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving LaRC facilities and scientists; and (4) computer science.

    ICASE

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in the areas of: (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving Langley facilities and scientists; and (4) computer science.

    Parallelization of Stochastic Evolution

    The complexity involved in VLSI design and its sub-problems has always made them ideal application areas for non-deterministic iterative heuristics. However, the major drawback has been the large runtime required to reach acceptable solutions, especially in the case of multi-objective optimization problems. Among the acceleration techniques proposed, parallelization of iterative heuristics is a promising one: the motivations for parallel CAD include faster runtimes, the handling of larger problem sizes, and the exploration of a larger search space. In this work, the development of parallel algorithms for Stochastic Evolution, applied to the multi-objective VLSI cell-placement problem, is presented. In VLSI circuit design, placement is the process of arranging circuit blocks on a layout. In standard cell design, placement consists of determining the optimum positions of all blocks on the layout to satisfy the constraints and improve a number of objectives. The placement objectives in our work are to reduce power dissipation and wire length while improving performance (timing). The parallelization is achieved on a cluster of workstations interconnected by a low-latency network, using MPI communication libraries. Circuits from ISCAS-89 are used as benchmarks. Results for parallel Stochastic Evolution are compared with those of its sequential counterpart, and with the results achieved by parallel versions of Simulated Annealing, as a reference point for both the quality of solution and the execution time. After parallelization, linear and super-linear speed-ups were obtained with no degradation in the quality of the solution.
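    The sketch below illustrates, in mpi4py, the general shape of this kind of parallelization: each rank runs an independent stochastic local search with its own random stream, and the best cost found is reduced to rank 0. The toy swap-based cost function and accept loop stand in for the real Stochastic Evolution heuristic and the ISCAS-89 circuits; run with, e.g., mpirun -np 8 python search.py (file name hypothetical).

        # Minimal mpi4py sketch of parallel iterative search: every rank runs an
        # independent stochastic local search on a toy "placement" cost, then the best
        # result across ranks is reduced to rank 0. The toy cost and the simple
        # swap-and-accept loop stand in for the real Stochastic Evolution heuristic.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        rng = np.random.default_rng(seed=rank)        # a different random stream per rank

        N_CELLS = 64
        positions = rng.permutation(N_CELLS)          # toy placement: cells assigned to slots

        def cost(perm):
            """Toy wire-length proxy: slot distance between consecutively numbered cells."""
            slot_of = np.argsort(perm)                # slot index of each cell
            return float(np.sum(np.abs(np.diff(slot_of))))

        best = cost(positions)
        for _ in range(20_000):
            i, j = rng.integers(0, N_CELLS, size=2)   # propose swapping two cells
            positions[[i, j]] = positions[[j, i]]
            c = cost(positions)
            if c <= best:
                best = c                              # keep improving (or equal-cost) moves
            else:
                positions[[i, j]] = positions[[j, i]] # otherwise undo the swap

        global_best = comm.reduce(best, op=MPI.MIN, root=0)
        if rank == 0:
            print("best cost over all ranks:", global_best)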
