255 research outputs found

    Modelling network memory servers with parallel processors, break-downs and repairs.

    Get PDF
    This paper presents an analytical method for the performability evaluation of a previously reported network memory server attached to a local area network. To increase the performance and availability of the proposed system, an additional server is added to the system. Such systems are prone to failures. With this in mind, a mathematical model has been developed to analyse the performability of the proposed system with break-downs and repairs. Mean queue lengths and the probability of job losses for the LAN feeding the network memory server is calculated and presented

    Modelling network memory servers with parallel processors, break-downs and repairs.

    Get PDF
    This paper presents an analytical method for the performability evaluation of a previously reported network memory server attached to a local area network. To increase the performance and availability of the proposed system, an additional server is added to the system. Such systems are prone to failures. With this in mind, a mathematical model has been developed to analyse the performability of the proposed system with break-downs and repairs. Mean queue lengths and the probability of job losses for the LAN feeding the network memory server is calculated and presented

    The Performability Manager

    Get PDF
    The authors describe the performability manager, a distributed system component that contributes to a more effective and efficient use of system components and prevents quality of service (QoS) degradation. The performability manager dynamically reconfigures distributed systems whenever needed, to recover from failures and to permit the system to evolve over time and include new functionality. Large systems require dynamic reconfiguration to support dynamic change without shutting down the complete system. A distributed system monitor is needed to verify QoS. Monitoring a distributed system is difficult because of synchronization problems and minor differences in clock speeds. The authors describe the functionality and the operation of the performability manager (both informally and formally). Throughout the paper they illustrate the approach by an example distributed application: an ANSAware-based number translation service (NTS), from the intelligent networks (IN) area

    Performability modelling of homogenous and heterogeneous multiserver systems with breakdowns and repairs

    Get PDF
    This thesis presents analytical modelling of homogeneous multi-server systems with reconfiguration and rebooting delays, heterogeneous multi-server systems with one main and several identical servers, and farm paradigm multi-server systems. This thesis also includes a number of other research works such as, fast performability evaluation models of open networks of nodes with repairs and finite queuing capacities, multi-server systems with deferred repairs, and two stage tandem networks with failures, repairs and multiple servers at the second stage. Applications of these for the popular Beowulf cluster systems and memory servers are also accomplished. Existing techniques used in performance evaluation of multi-server systems are investigated and analysed in detail. Pure performance modelling techniques, pure availability models, and performability models are also considered. First, the existing approaches for pure performance modelling are critically analysed with the discussions on merits and demerits. Then relevant terminology is defined and explained. Since the pure performance models tend to be too optimistic and pure availability models are too conservative, performability models are used for the evaluation of multi-server systems. Fault-tolerant multi-server systems can continue service in case of certain failures. If failure does not occur at a critical point (such as breakdown of the head processor of a farm paradigm system) the system continues serving in a degraded mode of operation. In such systems, reconfiguration and/or rebooting delays are expected while a processor is being mapped out from the system. These delay stages are also taken into account in addition to failures and repairs, in the exact performability models that are developed. Two dimensional Markov state space representations of the systems are used for performability modelling. Following the critical analysis of the existing solution techniques, the Spectral Expansion method is chosen for the solution of the models developed. In this work, open queuing networks are also considered. To evaluate their performability, existing modelling approaches are expanded and validated by simulations, for performability analysis of multistage open networks with finite queuing capacities. The performances of two extended modelling approaches are compared in terms of accuracy for open networks with various queuing capacities. Deferred repair strategies are becoming popular because of the cost reductions they can provide. Effects of using deferred repairs are analysed and performability models are provided for homogeneous multi-server systems and highly available farm paradigm multi-server systems. Since one of the random variables is used to represent the number of jobs in one of the queues, analytical models for performance evaluation of two stage tandem networks suffer because of numerical cumbersomeness. Existing approaches for modelling these systems are actually pure performance models since breakdowns and repairs cannot be considered. One way of modelling these systems can be to divide one of the random variables to present both the operative and non-operative states of the server in one dimension. However, this will give rise to state explosion problem severely limiting the maximum queue capacity that can be handled. In order to overcome this problem a new approach is presented for modelling two stage tandem networks in three dimensions. An approximate solution is presented to solve such a system. This approach manifests itself as a novel contribution for alleviating the state space explosion problem for large and/or complex systems. When two state tandem networks with feedback are modelled using this approach, the operative states can be handled independently and this makes it possible to consider multiple operative states at the second stage. The analytical models presented can be used with various parameters and they are extendible to consider systems with similar architectures. The developed three dimensional approach is capable to handle two stage tandem networks with various characteristics for performability measures. All the approaches presented give accurate results. Numerical solutions are presented for all models developed. In case the solution presented is not exact, simulations are performed to validate the accuracy of the results obtained

    Software Performance Engineering for Cloud Applications – A Survey

    Get PDF
    Cloud computing enables application service providers to lease their computing capabilities for deploying applications depending on user QoS (Quality of Service) requirements.Cloud applications have different composition, configuration and deployment requirements.Quantifying the performance of applications in Cloud computing environments is a challenging task. Software performance engineering(SPE) techniques enable us to assess performance requirements of software applications at the early stages of development. This assessment helps the developers to fine tune their design needs so that the targeted performance goals can be met. In this paper, we try to analyseperformance related issues of cloud applications and identify any SPE techniques currently available for cloud applications

    Exploring gate-limited analytical models for high-performance network storage servers

    Get PDF
    Gate-limited service is a type of service discipline found in queueing theory and can be used to describe a number of operational environments, for example, large transport systems such as buses, trains or taxis, etc. Recently, there has been the observation that such systems can also be used to describe interactive Internet Services which use a Client/Server interaction. In addition, new services of this genre are being developed for the local area. One such service is a Network Memory Server (NMS) being developed here at Middlesex University. Though there are several examples of real systems that can be modelled using gate-limited service, it is fair to say that the analytical models which have been developed for gate-limited systems have been difficult to use, requiring many iterations before practical results can be generated. In this paper, a detailed gate-limited bulk service queueing model based on Markov chains is explored and a numerical solution is demonstrated for simple scenarios. Quantitative results are presented and compared with a mathematical simulation. The analysis is used to develop an algorithm based on the concept of optimum operational points. The algorithm is then employed to build a high-performance server which is capable of balancing the need to prefetch for streaming applications while promptly satisfying demand misses. The algorithm is further tested using a systems simulation and then incorporated into an Experimental File System (EFS) which showed that the algorithm can be used in a real networking environment

    Experimental analysis of computer system dependability

    Get PDF
    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software-dependability, and fault diagnosis. The discussion involves several important issues studies in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance

    Availability modeling and evaluation on high performance cluster computing systems

    Get PDF
    Cluster computing has been attracting more and more attention from both the industrial and the academic world for its enormous computing power, cost effective, and scalability. Beowulf type cluster, for example, is a typical High Performance Computing (HPC) cluster system. Availability, as a key attribute of the system, needs to be considered at the system design stage and monitored at mission time. Moreover, system monitoring is a must to help identify the defects and ensure the system\u27s availability requirement. In this study, novel solutions which provide availability modeling, model evaluation, and data analysis as a single framework have been investigated. Three key components in the investigation are availability modeling, model evaluation, and data analysis. The general availability concepts and modeling techniques are briefly reviewed. The system\u27s availability model is divided into submodels based upon their functionalities. Furthermore, an object oriented Markov model specification to facilitate availability modeling and runtime configuration has been developed. Numerical solutions for Markov models are examined, especially on the uniformization method. Alternative implementations of the method are discussed; particularly on analyzing the cost of an alternative solution for small state space model, and different ways for solving large sparse Markov models. The dissertation also presents a monitoring and data analysis framework, which is responsible for failure analysis and availability reconfiguration. In addition, the event logs provided from the Lawrence Livermore National Laboratory have been studied and applied to validate the proposed techniques
    • …
    corecore