8 research outputs found

    Performability modelling of homogenous and heterogeneous multiserver systems with breakdowns and repairs

    This thesis presents analytical modelling of homogeneous multi-server systems with reconfiguration and rebooting delays, heterogeneous multi-server systems with one main and several identical servers, and farm paradigm multi-server systems. This thesis also includes a number of other research works, such as fast performability evaluation models of open networks of nodes with repairs and finite queuing capacities, multi-server systems with deferred repairs, and two stage tandem networks with failures, repairs and multiple servers at the second stage. Applications of these models to the popular Beowulf cluster systems and memory servers are also presented. Existing techniques used in performance evaluation of multi-server systems are investigated and analysed in detail. Pure performance modelling techniques, pure availability models, and performability models are also considered. First, the existing approaches for pure performance modelling are critically analysed, with a discussion of their merits and demerits. Then relevant terminology is defined and explained. Since pure performance models tend to be too optimistic and pure availability models are too conservative, performability models are used for the evaluation of multi-server systems. Fault-tolerant multi-server systems can continue service in case of certain failures. If a failure does not occur at a critical point (such as breakdown of the head processor of a farm paradigm system), the system continues serving in a degraded mode of operation. In such systems, reconfiguration and/or rebooting delays are expected while a processor is being mapped out from the system. These delay stages are also taken into account, in addition to failures and repairs, in the exact performability models that are developed. Two-dimensional Markov state space representations of the systems are used for performability modelling.
Following the critical analysis of the existing solution techniques, the Spectral Expansion method is chosen for the solution of the models developed. In this work, open queuing networks are also considered. To evaluate their performability, existing modelling approaches are extended to multistage open networks with finite queuing capacities and validated by simulations. The two extended modelling approaches are compared in terms of accuracy for open networks with various queuing capacities. Deferred repair strategies are becoming popular because of the cost reductions they can provide. The effects of using deferred repairs are analysed, and performability models are provided for homogeneous multi-server systems and highly available farm paradigm multi-server systems. Since one of the random variables is used to represent the number of jobs in one of the queues, analytical models for performance evaluation of two stage tandem networks suffer from numerical cumbersomeness. Existing approaches for modelling these systems are actually pure performance models, since breakdowns and repairs cannot be considered. One way of modelling these systems is to divide one of the random variables so that it represents both the operative and non-operative states of the server in one dimension. However, this gives rise to the state explosion problem, severely limiting the maximum queue capacity that can be handled. In order to overcome this problem, a new approach is presented for modelling two stage tandem networks in three dimensions. An approximate solution is presented to solve such a system. This approach is a novel contribution towards alleviating the state space explosion problem for large and/or complex systems. When two stage tandem networks with feedback are modelled using this approach, the operative states can be handled independently, which makes it possible to consider multiple operative states at the second stage.
The analytical models presented can be used with various parameters and can be extended to systems with similar architectures. The three-dimensional approach developed is capable of handling two stage tandem networks with various characteristics for performability measures. All the approaches presented give accurate results. Numerical solutions are presented for all models developed. Where the solution presented is not exact, simulations are performed to validate the accuracy of the results obtained.
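To make the idea of a two-dimensional Markov state space concrete, the following is a minimal sketch, not the thesis's method. It assumes a homogeneous system of `K` servers with a finite capacity of `L` jobs, Poisson arrivals, exponential service, failure and repair times, and a single repairman; states are pairs (operative servers, jobs present). Instead of Spectral Expansion, it solves the global balance equations directly by Gaussian elimination, which is only practical for small state spaces. All function names and parameter values are hypothetical.

```python
def steady_state(K, L, lam, mu, xi, eta):
    """Steady-state probabilities of states (i, j): i operative servers, j jobs.

    Assumptions (illustrative only): arrivals at rate lam are lost when j == L,
    each operative server serves at rate mu and fails at rate xi (busy or idle),
    and a single repairman repairs one server at a time at rate eta.
    """
    states = [(i, j) for i in range(K + 1) for j in range(L + 1)]
    index = {s: n for n, s in enumerate(states)}
    n = len(states)
    Q = [[0.0] * n for _ in range(n)]  # generator matrix

    def add(src, dst, rate):
        if rate > 0:
            Q[index[src]][index[dst]] += rate
            Q[index[src]][index[src]] -= rate

    for i, j in states:
        if j < L:
            add((i, j), (i, j + 1), lam)              # arrival
        if j > 0:
            add((i, j), (i, j - 1), min(i, j) * mu)   # service completion
        if i > 0:
            add((i, j), (i - 1, j), i * xi)           # breakdown
        if i < K:
            add((i, j), (i + 1, j), eta)              # repair

    # Balance equations pi * Q = 0: row c of A is the equation for state c.
    A = [[Q[r][c] for r in range(n)] for c in range(n)]
    b = [0.0] * n
    A[-1] = [1.0] * n   # replace one redundant equation by sum(pi) = 1
    b[-1] = 1.0

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            if f != 0.0:
                for c in range(col, n):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]

    # Back substitution.
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(A[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = s / A[r][r]
    return dict(zip(states, pi))


def measures(pi, L, mu):
    """Mean number of jobs, throughput, and blocking probability."""
    mean_jobs = sum(j * p for (i, j), p in pi.items())
    throughput = sum(min(i, j) * mu * p for (i, j), p in pi.items())
    blocking = sum(p for (i, j), p in pi.items() if j == L)
    return mean_jobs, throughput, blocking


if __name__ == "__main__":
    pi = steady_state(K=2, L=10, lam=1.5, mu=1.0, xi=0.01, eta=0.5)
    mean_jobs, throughput, blocking = measures(pi, L=10, mu=1.0)
    print(f"mean jobs={mean_jobs:.4f} throughput={throughput:.4f} "
          f"blocking={blocking:.4f}")
```

The state count here is (K + 1)(L + 1), so a direct solve grows quickly with queue capacity. That growth is the kind of numerical burden the abstract refers to when it motivates Spectral Expansion and the approximate three-dimensional approach.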

    3D analytical modelling and iterative solution for high performance computing clusters

    Mobile Cloud Computing enables the migration of services to the edge of the Internet. Therefore, high-performance computing clusters are widely deployed to improve computational capabilities of such environments. However, they are prone to failures and need analytical models to predict their behaviour in order to deliver desired quality-of-service and quality-of-experience to mobile users. This paper proposes a 3D analytical model and a problem-solving approach for sustainability evaluation of high-performance computing clusters. The proposed solution uses an iterative approach to obtain performance measurements to overcome the state space explosion problem. The availability modelling and evaluation of master and computing nodes are performed using a multi-repairman approach. The optimum number of repairmen is also obtained to get realistic results and reduce the overall cost. The proposed model is validated using discrete event simulation. The analytical approach is much faster and in good agreement with the simulations. The analysis focuses on mean queue length, throughput, and mean response time outputs. The maximum differences between analytical and simulation results in the considered scenarios of up to a billion states are less than 1.149%, 3.82%, and 3.76%, respectively. These differences are well within the 5% confidence interval of the simulation and the proposed model.
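As a rough illustration of what a multi-repairman availability model can look like, the sketch below treats the number of operative nodes as a birth-death process and uses its product-form steady-state solution. This is not the paper's 3D model or its iterative solution. It assumes identical nodes with exponential failure and repair times and repairmen who each work on one failed node at a time; the function names and rates are hypothetical.

```python
def operative_distribution(n_nodes, n_repairmen, fail_rate, repair_rate):
    """Steady-state distribution of the number of operative nodes.

    Birth-death balance between k and k + 1 operative nodes:
        p[k] * min(n_nodes - k, n_repairmen) * repair_rate
            == p[k + 1] * (k + 1) * fail_rate
    """
    weights = [1.0]  # unnormalised probability of 0 operative nodes
    for k in range(n_nodes):
        busy_repairmen = min(n_nodes - k, n_repairmen)
        up_rate = busy_repairmen * repair_rate
        down_rate = (k + 1) * fail_rate
        weights.append(weights[-1] * up_rate / down_rate)
    total = sum(weights)
    return [w / total for w in weights]


def mean_operative(dist):
    """Expected number of operative nodes."""
    return sum(k * p for k, p in enumerate(dist))


if __name__ == "__main__":
    # Hypothetical rates: compare staffing levels for an 8-node cluster.
    for repairmen in range(1, 9):
        dist = operative_distribution(8, repairmen, fail_rate=0.05,
                                      repair_rate=0.5)
        print(repairmen, round(mean_operative(dist), 4))
```

Sweeping the number of repairmen in this way shows diminishing returns in the expected number of operative nodes. Combined with a cost per repairman, which is not modelled here, a sweep like this is one simple way to look for a cost-effective staffing level.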

    Analytical modelling and simulation of small scale, typical and highly available Beowulf clusters with breakdowns and repairs

    Beowulf clusters are very popular because of the high computational power they can provide at reasonably low costs. However, the most pressing issues of today’s cluster solutions are the need for high availability and performance. Cluster systems are clearly prone to failures. Even if cover is provided with some probability c, there will be reconfiguration and/or rebooting delays before operation resumes following a failure. In this paper, the performability modelling of both typical and highly available Beowulf multiprocessor systems is presented. The models developed provide a large degree of flexibility to evaluate the performability of typical and highly available Beowulf cluster systems.
