Extended Differential Aggregations in Process Algebra for Performance and Biology
We study aggregations for ordinary differential equations induced by fluid
semantics for Markovian process algebra which can capture the dynamics of
performance models and chemical reaction networks. Whilst previous work has
required perfect symmetry for exact aggregation, we present approximate fluid
lumpability, which makes nearby processes perfectly symmetric after a
perturbation of their parameters. We prove that small perturbations yield
nearby differential trajectories. Numerically, we show that many heterogeneous
processes can be aggregated with negligible errors.
Comment: In Proceedings QAPL 2014, arXiv:1406.156
An Improved Multi-Stage Preconditioner on GPUs for Compositional Reservoir Simulation
The compositional model is often used to describe multicomponent multiphase
porous media flows in the petroleum industry. The fully implicit method with
strong stability and weak constraints on time-step sizes is commonly used in
the mainstream commercial reservoir simulators. In this paper, we develop an
efficient multi-stage preconditioner for the fully implicit compositional flow
simulation. The method employs an adaptive setup phase to improve the parallel
efficiency on GPUs. Furthermore, a multi-color Gauss-Seidel algorithm based on
the adjacency matrix is applied in the algebraic multigrid methods for the
pressure part. Numerical results demonstrate that the proposed algorithm
achieves good parallel speedup while yielding the same convergence behavior as
the corresponding sequential version.
Comment: 24 pages, 4 figures, and 8 tables. arXiv admin note: text overlap
with arXiv:2201.0197
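To illustrate the multi-color Gauss-Seidel idea mentioned in the abstract above, here is a minimal CPU sketch in NumPy. It is not the authors' GPU implementation: the function names `greedy_coloring` and `multicolor_gauss_seidel` are hypothetical, the greedy coloring is one simple choice among many, and the sketch assumes a matrix with a structurally symmetric nonzero pattern and a nonzero diagonal. Rows that share a color have no mutual coupling, so each color can be updated in one vectorized (and, on a GPU, parallel) step.

```python
import numpy as np

def greedy_coloring(A):
    """Color the adjacency graph of A so that coupled rows get different colors.

    Assumes the nonzero pattern of A is structurally symmetric.
    """
    n = A.shape[0]
    colors = -np.ones(n, dtype=int)
    for i in range(n):
        nbrs = np.nonzero(A[i])[0]
        used = {colors[j] for j in nbrs if j != i and colors[j] >= 0}
        c = 0
        while c in used:  # smallest color not used by an already-colored neighbor
            c += 1
        colors[i] = c
    return colors

def multicolor_gauss_seidel(A, b, x, colors, sweeps=1):
    """Gauss-Seidel sweeps in which all rows of one color are updated together."""
    d = np.diag(A)
    for _ in range(sweeps):
        for c in range(colors.max() + 1):
            rows = np.nonzero(colors == c)[0]
            # Same-color rows do not depend on each other, so this batched
            # update equals a sequential Gauss-Seidel pass over these rows.
            r = b[rows] - A[rows] @ x
            x[rows] += r / d[rows]
    return x

# Toy example (an assumption for illustration): 1D Laplacian, exact solution = ones.
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = A @ np.ones(n)
colors = greedy_coloring(A)
x = multicolor_gauss_seidel(A, b, np.zeros(n), colors, sweeps=200)
```

On this tridiagonal example the greedy coloring reduces to the familiar red-black ordering with two colors.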
Scalable Performance Analysis of Massively Parallel Stochastic Systems
The accurate performance analysis of large-scale computer and communication systems is directly
inhibited by an exponential growth in the state-space of the underlying Markovian performance
model. This is particularly true when considering massively-parallel architectures
such as cloud or grid computing infrastructures. Nevertheless, an ability to extract quantitative
performance measures such as passage-time distributions from performance models of
these systems is critical for providers of these services. Indeed, without such an ability, they
remain unable to offer realistic end-to-end service level agreements (SLAs) which they can have
any confidence of honouring. Additionally, this must be possible in a short enough period of
time to allow many different parameter combinations in a complex system to be tested. If we
can achieve this rapid performance analysis goal, it will enable service providers and engineers
to determine the cost-optimal behaviour which satisfies the SLAs.
In this thesis, we develop a scalable performance analysis framework for the grouped PEPA
stochastic process algebra. Our approach is based on the approximation of key model quantities
such as means and variances by tractable systems of ordinary differential equations (ODEs).
Crucially, the size of these systems of ODEs is independent of the number of interacting entities
within the model, making these analysis techniques extremely scalable. The reliability of our
approach is directly supported by convergence results and, in some cases, explicit error bounds.
We focus on extracting passage-time measures from performance models since these are very
commonly the language in which a service level agreement is phrased. We design scalable analysis
techniques which can handle passages defined both in terms of entire component populations
as well as individual or tagged members of a large population.
A precise and straightforward specification of a passage-time service level agreement is as important
to the performance engineering process as its evaluation. This is especially true of
large and complex models of industrial-scale systems. To address this, we introduce the unified
stochastic probe framework. Unified stochastic probes are used to generate a model augmentation
which exposes explicitly the SLA measure of interest to the analysis toolkit. In this thesis,
we deploy these probes to define many detailed and derived performance measures that can
be automatically and directly analysed using rapid ODE techniques. In this way, we tackle
applicable problems at many levels of the performance engineering process: from specification
and model representation to efficient and scalable analysis.
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Despite rapid adoption and deployment of large language models (LLMs), the
internal computations of these models remain opaque and poorly understood. In
this work, we seek to understand how high-level human-interpretable features
are represented within the internal neuron activations of LLMs. We train
k-sparse linear classifiers (probes) on these internal activations to predict
the presence of features in the input; by varying the value of k we study the
sparsity of learned representations and how this varies with model scale. With
k = 1, we localize individual neurons which are highly relevant for a
particular feature, and perform a number of case studies to illustrate general
properties of LLMs. In particular, we show that early layers make use of sparse
combinations of neurons to represent many features in superposition, that
middle layers have seemingly dedicated neurons to represent higher-level
contextual features, and that increasing scale causes representational sparsity
to increase on average, but there are multiple types of scaling dynamics. In
all, we probe for over 100 unique features comprising 10 different categories
in 7 different models spanning 70 million to 6.9 billion parameters.
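As a rough illustration of what a sparse linear probe looks like, the sketch below selects a small number of neurons and fits a linear classifier on them alone. It is a simplified stand-in under stated assumptions, not necessarily the procedure used in the paper: neurons are ranked by the absolute difference of class means, the classifier is an ordinary least-squares fit, the activations are synthetic, and the names `k_sparse_probe` and `probe_predict` are hypothetical.

```python
import numpy as np

def k_sparse_probe(acts, labels, k):
    """Fit a linear probe that reads only k neurons.

    acts:   (n_samples, n_neurons) activation matrix
    labels: (n_samples,) array of 0/1 feature labels
    """
    pos = acts[labels == 1].mean(axis=0)
    neg = acts[labels == 0].mean(axis=0)
    # Heuristic selection (an assumption): keep the k neurons whose mean
    # activation differs most between the two classes.
    idx = np.argsort(-np.abs(pos - neg))[:k]
    # Least-squares fit on the selected neurons plus a bias column,
    # with targets mapped to -1 / +1.
    X = np.column_stack([acts[:, idx], np.ones(len(acts))])
    w, *_ = np.linalg.lstsq(X, 2.0 * labels - 1.0, rcond=None)
    return idx, w

def probe_predict(acts, idx, w):
    X = np.column_stack([acts[:, idx], np.ones(len(acts))])
    return (X @ w > 0).astype(int)

# Synthetic activations (illustrative only): neuron 7 carries the feature.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
acts = rng.normal(size=(400, 50))
acts[:, 7] += 4.0 * labels

idx, w = k_sparse_probe(acts, labels, k=1)
accuracy = (probe_predict(acts, idx, w) == labels).mean()
```

Sweeping `k` upward and tracking how probe accuracy changes gives a crude picture of how many neurons a feature is spread across.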
Modular µ-calculus model-checking with formula-dependent hierarchical abstractions
This paper defines a formal framework for the modular and hierarchical model-checking of µ-calculus against modular transition systems. Given a formula ϕ, a module can be analysed alone, in such a way that the truth value of ϕ may be decided without the need to analyse other modules. If no conclusion can be drawn locally, the analysis provides information that allows the module to be reduced to a smaller one that is equivalent with respect to the truth value of ϕ. This way, modules can be incrementally analysed, reduced and composed with other reduced modules until a conclusion is reached. On the one hand, modular analysis avoids module compositions and thus the corresponding combinatorial explosion; on the other hand, hierarchical analysis reduces the modules that must be composed, which limits combinatorial explosion. Moreover, by proposing three complementary formula-dependent reductions, we expect better reductions than general approaches like bisimulation or τ* reductions. The current paper is focused on defining the theoretical tools for this approach; finding interesting strategies to apply them efficiently is left to future work.
Model-driven Scheduling for Distributed Stream Processing Systems
Distributed Stream Processing frameworks are being commonly used with the
evolution of the Internet of Things (IoT). These frameworks are designed to adapt to
the dynamic input message rate by scaling in/out. Apache Storm, originally
developed by Twitter, is a widely used stream processing engine, while others
include Flink and Spark Streaming. To run streaming applications
successfully, one needs to know the optimal resource requirement, as
over-estimation of resources adds extra cost. We therefore need a strategy to
determine the optimal resource requirement for a given streaming application. In
this article, we propose a model-driven approach for scheduling streaming
applications that effectively utilizes a priori knowledge of the applications
to provide predictable scheduling behavior. Specifically, we use application
performance models to offer reliable estimates of the resource allocation
required. Further, this intuition also drives resource mapping, and helps
narrow the gap between the estimated and actual dataflow performance and resource utilization.
Together, this model-driven scheduling approach gives a predictable application
performance and resource utilization behavior for executing a given DSPS
application at a target input stream rate on distributed resources.
Comment: 54 page