518 research outputs found

    Extended Differential Aggregations in Process Algebra for Performance and Biology

    We study aggregations for ordinary differential equations induced by fluid semantics for Markovian process algebra which can capture the dynamics of performance models and chemical reaction networks. Whilst previous work has required perfect symmetry for exact aggregation, we present approximate fluid lumpability, which makes nearby processes perfectly symmetric after a perturbation of their parameters. We prove that small perturbations yield nearby differential trajectories. Numerically, we show that many heterogeneous processes can be aggregated with negligible errors. Comment: In Proceedings QAPL 2014, arXiv:1406.156

    An Improved Multi-Stage Preconditioner on GPUs for Compositional Reservoir Simulation

    The compositional model is often used to describe multicomponent multiphase porous media flows in the petroleum industry. The fully implicit method, which offers strong stability and only weak constraints on time-step size, is commonly used in mainstream commercial reservoir simulators. In this paper, we develop an efficient multi-stage preconditioner for the fully implicit compositional flow simulation. The method employs an adaptive setup phase to improve the parallel efficiency on GPUs. Furthermore, a multi-color Gauss-Seidel algorithm based on the adjacency matrix is applied in the algebraic multigrid methods for the pressure part. Numerical results demonstrate that the proposed algorithm achieves good parallel speedup while yielding the same convergence behavior as the corresponding sequential version. Comment: 24 pages, 4 figures, and 8 tables. arXiv admin note: text overlap with arXiv:2201.0197
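As a rough illustration of the multi-color Gauss-Seidel idea mentioned in this abstract, the sketch below colors the unknowns greedily from the sparsity pattern of the matrix and then updates all unknowns of one color together. It is a generic CPU sketch in NumPy, not the paper's GPU implementation; the function names `greedy_coloring` and `multicolor_gauss_seidel`, the dense matrix, the symmetric sparsity pattern, and the 1D Laplacian test problem are all assumptions made for the example.

```python
import numpy as np

def greedy_coloring(A):
    """Color unknowns so that no two coupled unknowns share a color.

    The adjacency graph is read from the nonzero pattern of A
    (assumed symmetric here; this helper is hypothetical).
    """
    n = A.shape[0]
    colors = -np.ones(n, dtype=int)
    for i in range(n):
        neighbours = [j for j in np.nonzero(A[i])[0] if j != i]
        used = {int(colors[j]) for j in neighbours if colors[j] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[i] = c
    return colors

def multicolor_gauss_seidel(A, b, colors, sweeps=200):
    """Gauss-Seidel sweeps in which each color is updated as one batch.

    Unknowns of the same color are not coupled to each other, so the
    batch update is the part that could run in parallel.
    """
    x = np.zeros_like(b, dtype=float)
    diag = np.diag(A)
    groups = [np.nonzero(colors == c)[0] for c in range(colors.max() + 1)]
    for _ in range(sweeps):
        for idx in groups:
            # Residual without the diagonal term, using the latest values of x.
            r = b[idx] - A[idx] @ x + diag[idx] * x[idx]
            x[idx] = r / diag[idx]
    return x

# Assumed test problem: 1D Laplacian, which gives a red-black (two-color) ordering.
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
colors = greedy_coloring(A)
x = multicolor_gauss_seidel(A, b, colors)
```

Within one color the updates are independent, which is why colored orderings are a common way to expose parallelism in Gauss-Seidel smoothers.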

    Scalable Performance Analysis of Massively Parallel Stochastic Systems

    The accurate performance analysis of large-scale computer and communication systems is directly inhibited by an exponential growth in the state-space of the underlying Markovian performance model. This is particularly true when considering massively-parallel architectures such as cloud or grid computing infrastructures. Nevertheless, an ability to extract quantitative performance measures such as passage-time distributions from performance models of these systems is critical for providers of these services. Indeed, without such an ability, they remain unable to offer realistic end-to-end service level agreements (SLAs) which they can have any confidence of honouring. Additionally, this must be possible in a short enough period of time to allow many different parameter combinations in a complex system to be tested. If we can achieve this rapid performance analysis goal, it will enable service providers and engineers to determine the cost-optimal behaviour which satisfies the SLAs. In this thesis, we develop a scalable performance analysis framework for the grouped PEPA stochastic process algebra. Our approach is based on the approximation of key model quantities such as means and variances by tractable systems of ordinary differential equations (ODEs). Crucially, the size of these systems of ODEs is independent of the number of interacting entities within the model, making these analysis techniques extremely scalable. The reliability of our approach is directly supported by convergence results and, in some cases, explicit error bounds. We focus on extracting passage-time measures from performance models since these are very commonly the language in which a service level agreement is phrased. We design scalable analysis techniques which can handle passages defined both in terms of entire component populations as well as individual or tagged members of a large population. 
A precise and straightforward specification of a passage-time service level agreement is as important to the performance engineering process as its evaluation. This is especially true of large and complex models of industrial-scale systems. To address this, we introduce the unified stochastic probe framework. Unified stochastic probes are used to generate a model augmentation which exposes explicitly the SLA measure of interest to the analysis toolkit. In this thesis, we deploy these probes to define many detailed and derived performance measures that can be automatically and directly analysed using rapid ODE techniques. In this way, we tackle applicable problems at many levels of the performance engineering process: from specification and model representation to efficient and scalable analysis

    Finding Neurons in a Haystack: Case Studies with Sparse Probing

    Despite rapid adoption and deployment of large language models (LLMs), the internal computations of these models remain opaque and poorly understood. In this work, we seek to understand how high-level human-interpretable features are represented within the internal neuron activations of LLMs. We train $k$-sparse linear classifiers (probes) on these internal activations to predict the presence of features in the input; by varying the value of $k$ we study the sparsity of learned representations and how this varies with model scale. With $k=1$, we localize individual neurons which are highly relevant for a particular feature, and perform a number of case studies to illustrate general properties of LLMs. In particular, we show that early layers make use of sparse combinations of neurons to represent many features in superposition, that middle layers have seemingly dedicated neurons to represent higher-level contextual features, and that increasing scale causes representational sparsity to increase on average, but there are multiple types of scaling dynamics. In all, we probe for over 100 unique features comprising 10 different categories in 7 different models spanning 70 million to 6.9 billion parameters
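To make the notion of a k-sparse linear probe concrete, here is a minimal sketch on synthetic data. It assumes a simple two-step recipe (rank neurons by a class-mean-difference score, then fit a least-squares linear classifier on the top k), which is one possible way to build such a probe and not necessarily the selection method used in the paper. The names `sparse_probe` and `probe_predict` and the synthetic activations are hypothetical.

```python
import numpy as np

def sparse_probe(acts, labels, k):
    """Fit a k-sparse linear probe on activations (assumed recipe, see lead-in)."""
    acts = np.asarray(acts, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Score each neuron by the gap between class means, scaled by its spread.
    mu1 = acts[labels == 1].mean(axis=0)
    mu0 = acts[labels == 0].mean(axis=0)
    scores = np.abs(mu1 - mu0) / (acts.std(axis=0) + 1e-12)
    support = np.sort(np.argsort(scores)[-k:])
    # Least-squares fit to targets in {-1, +1} using only the selected neurons plus a bias.
    X = np.column_stack([acts[:, support], np.ones(len(acts))])
    y = 2.0 * labels - 1.0
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return support, w

def probe_predict(acts, support, w):
    """Predict the binary feature from the selected neurons only."""
    acts = np.asarray(acts, dtype=float)
    X = np.column_stack([acts[:, support], np.ones(len(acts))])
    return (X @ w > 0).astype(int)

# Synthetic stand-in for model activations: neuron 3 carries the feature.
rng = np.random.default_rng(0)
n_samples, n_neurons = 400, 16
labels = rng.integers(0, 2, size=n_samples)
acts = rng.normal(size=(n_samples, n_neurons))
acts[:, 3] += 4.0 * labels

support, w = sparse_probe(acts, labels, k=1)
accuracy = (probe_predict(acts, support, w) == labels).mean()
```

With k=1 the probe is forced to name a single neuron, so `support` directly identifies the unit that best predicts the feature; raising k and watching the accuracy shows how spread out the representation is.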

    Modular µ-calculus model-checking with formula-dependent hierarchical abstractions

    This paper defines a formal framework for the modular and hierarchical model-checking of µ-calculus against modular transition systems. Given a formula ϕ, a module can be analysed alone, in such a way that the truth value of ϕ may be decided without the need to analyse other modules. If no conclusion can be drawn locally, the analysis provides information that allows the module to be reduced to a smaller one that is equivalent with respect to the truth value of ϕ. This way, modules can be incrementally analysed, reduced and composed with other reduced modules until a conclusion is reached. On the one hand, modular analysis avoids module compositions and thus the corresponding combinatorial explosion; on the other hand, hierarchical analysis reduces the modules that must be composed, which limits combinatorial explosion. Moreover, by proposing three complementary formula-dependent reductions, we expect better reductions than general approaches like bisimulation or τ*-reductions. The current paper is focused on defining the theoretical tools for this approach; finding interesting strategies to apply them efficiently is left to future work

    Model-driven Scheduling for Distributed Stream Processing Systems

    Distributed stream processing frameworks are increasingly used as the Internet of Things (IoT) evolves. These frameworks are designed to adapt to the dynamic input message rate by scaling in/out. Apache Storm, originally developed by Twitter, is a widely used stream processing engine; others include Flink and Spark Streaming. To run streaming applications successfully, one needs to know the optimal resource requirement, as over-estimating resources adds extra cost. We therefore need a strategy to determine the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the resource allocation required. Further, this intuition also drives resource mapping, and helps narrow the gap between the estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives a predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources. Comment: 54 page