A new tool for the performance analysis of massively parallel computer systems
We present a new tool, GPA, that can generate key performance measures for
very large systems. Based on solving systems of ordinary differential equations
(ODEs), this method of performance analysis is far more scalable than
stochastic simulation. The GPA tool is the first to produce higher moment
analysis from differential equation approximation, which is essential, in many
cases, to obtain an accurate performance prediction. We identify so-called
switch points as the source of error in the ODE approximation. We investigate
the switch point behaviour in several large models and observe that as the
scale of the model is increased, in general the ODE performance prediction
improves in accuracy. In the case of the variance measure, we are able to
justify theoretically that in the limit of model scale, the ODE approximation
can be expected to tend to the actual variance of the model.
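As a rough illustration of the ODE-based approach described in this abstract, the sketch below integrates a fluid approximation of a hypothetical client/server model. It is not the GPA tool or its API: the model, the rate values, and the names `fluid_derivative` and `integrate` are all assumptions made for this example. The `min` term marks where a switch point can arise, and the number of equations stays at two however many clients are modelled.

```python
# Minimal sketch of a fluid (ODE) approximation for a hypothetical
# client/server model. Not the GPA tool: model, rates and function
# names are illustrative assumptions.

def fluid_derivative(state, think_rate, serve_rate, servers):
    """Return (d thinking/dt, d waiting/dt) for the fluid model."""
    thinking, waiting = state
    # Each thinking client issues a request at rate think_rate.
    to_waiting = think_rate * thinking
    # At most `servers` requests are served at once; the min() is the
    # point where the dynamics switch between two linear regimes.
    to_thinking = serve_rate * min(waiting, servers)
    return (to_thinking - to_waiting, to_waiting - to_thinking)


def integrate(clients, servers, think_rate=1.0, serve_rate=2.0,
              t_end=20.0, dt=0.001):
    """Forward-Euler integration, starting with all clients thinking."""
    state = (float(clients), 0.0)
    for _ in range(int(round(t_end / dt))):
        d_think, d_wait = fluid_derivative(state, think_rate,
                                           serve_rate, servers)
        state = (state[0] + dt * d_think, state[1] + dt * d_wait)
    return state


if __name__ == "__main__":
    # Two equations are solved in both cases, regardless of population size.
    for clients, servers in [(100, 10), (1000, 100)]:
        thinking, waiting = integrate(clients, servers)
        print(clients, servers, round(thinking, 3), round(waiting, 3))
```

Forward Euler is used only to keep the sketch free of dependencies; a production analysis would normally use an adaptive ODE solver, and higher-moment measures such as variance would need additional equations that this example does not attempt.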
Asynchronous iterative solution for dominant eigenvectors with applications in performance modelling and PageRank
Patterns of Scalable Bayesian Inference
Datasets are growing not just in size but in complexity, creating a demand
for rich models and quantification of uncertainty. Bayesian methods are an
excellent fit for this demand, but scaling Bayesian inference is a challenge.
In response to this challenge, there has been considerable recent work based on
varying assumptions about model structure, underlying computational resources,
and the importance of asymptotic correctness. As a result, there is a zoo of
ideas with few clear overarching principles.
In this paper, we seek to identify unifying principles, patterns, and
intuitions for scaling Bayesian inference. We review existing work on utilizing
modern computing resources with both MCMC and variational approximation
techniques. From this taxonomy of ideas, we characterize the general principles
that have proven successful for designing scalable inference procedures and
comment on the path forward.
Scalable Performance Analysis of Massively Parallel Stochastic Systems
The accurate performance analysis of large-scale computer and communication systems is directly
inhibited by an exponential growth in the state-space of the underlying Markovian performance
model. This is particularly true when considering massively-parallel architectures
such as cloud or grid computing infrastructures. Nevertheless, an ability to extract quantitative
performance measures such as passage-time distributions from performance models of
these systems is critical for providers of these services. Indeed, without such an ability, they
remain unable to offer realistic end-to-end service level agreements (SLAs) which they can have
any confidence of honouring. Additionally, this must be possible in a short enough period of
time to allow many different parameter combinations in a complex system to be tested. If we
can achieve this rapid performance analysis goal, it will enable service providers and engineers
to determine the cost-optimal behaviour which satisfies the SLAs.
In this thesis, we develop a scalable performance analysis framework for the grouped PEPA
stochastic process algebra. Our approach is based on the approximation of key model quantities
such as means and variances by tractable systems of ordinary differential equations (ODEs).
Crucially, the size of these systems of ODEs is independent of the number of interacting entities
within the model, making these analysis techniques extremely scalable. The reliability of our
approach is directly supported by convergence results and, in some cases, explicit error bounds.
We focus on extracting passage-time measures from performance models since these are very
commonly the language in which a service level agreement is phrased. We design scalable analysis
techniques which can handle passages defined both in terms of entire component populations
as well as individual or tagged members of a large population.
A precise and straightforward specification of a passage-time service level agreement is as important
to the performance engineering process as its evaluation. This is especially true of
large and complex models of industrial-scale systems. To address this, we introduce the unified
stochastic probe framework. Unified stochastic probes are used to generate a model augmentation
which exposes explicitly the SLA measure of interest to the analysis toolkit. In this thesis,
we deploy these probes to define many detailed and derived performance measures that can
be automatically and directly analysed using rapid ODE techniques. In this way, we tackle
applicable problems at many levels of the performance engineering process: from specification
and model representation to efficient and scalable analysis.
Research in progress in applied mathematics, numerical analysis, fluid mechanics, and computer science
This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period October 1, 1993 through March 31, 1994. The major categories of the current ICASE research program are: (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving LaRC facilities and scientists; and (4) computer science.
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions.
ICASE
This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in the areas of (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving Langley facilities and scientists; and (4) computer science.
Parallelization of Stochastic Evolution
The complexity involved in VLSI design and its sub-problems has always made them ideal application areas for non-deterministic iterative heuristics. However, the major drawback has been the long runtime needed to reach acceptable solutions, especially in the case of multi-objective optimization problems. Among the acceleration techniques proposed, parallelization of iterative heuristics is a promising one. The motivations for parallel CAD include faster runtimes, handling of larger problem sizes, and exploration of a larger search space. In this work, the development of parallel algorithms for Stochastic Evolution, applied to the multi-objective VLSI cell-placement problem, is presented. In VLSI circuit design, placement is the process of arranging circuit blocks on a layout. In standard cell design, placement consists of determining optimum positions of all blocks on the layout to satisfy the constraint and improve a number of objectives. The placement objectives in our work are to reduce power dissipation and wire-length while improving performance (timing). The parallelization is achieved on a cluster of workstations interconnected by a low-latency network, using MPI communication libraries. Circuits from ISCAS-89 are used as benchmarks. Results for parallel Stochastic Evolution are compared with its sequential counterpart, and with the results achieved by parallel versions of Simulated Annealing as a reference point, for both solution quality and execution time. After parallelization, linear and super-linear speed-ups were obtained, with no degradation in solution quality.