12 research outputs found

    RAS Modeling of a Large InfiniBand Switch System

    Model-based sensitivity analysis of IaaS cloud availability

    The increasing shift of various critical services towards Infrastructure-as-a-Service (IaaS) cloud data centers (CDCs) creates a need for analyzing CDCs’ availability, which is affected by various factors including repair policy and system parameters. This paper aims to apply analytical modeling and sensitivity analysis techniques to investigate the impact of these factors on the availability of a large-scale IaaS CDC, which (1) consists of active and two kinds of standby physical machines (PMs), (2) allows PMs to move among the active and two kinds of standby PM pools, and (3) allows active and two kinds of standby PMs to have different mean repair times. Two repair policies are considered: (P1) all pools share a repair station and (P2) each pool uses its own repair station. We develop monolithic availability models for each repair policy by using Stochastic Reward Nets and also develop the corresponding scalable two-level models in order to overcome the monolithic model's limitations, caused by the large scale of a CDC and the complicated interactions among CDC components. We also explore how to apply the differential sensitivity analysis technique to conduct parametric sensitivity analysis in the case of interacting sub-models. Numerical results of monolithic models and simulation results are used to verify the approximate accuracy of interacting sub-models, which are further applied to examine the sensitivity of the large-scale CDC availability with respect to repair policy and system parameters.

    Modeling and analysis of high availability techniques in a virtualized system

    Availability evaluation of a virtualized system is critical to the wide deployment of cloud computing services. Time-based and prediction-based rejuvenation of virtual machines (VMs) and virtual machine monitors, VM failover, and live VM migration are common high-availability (HA) techniques in a virtualized system. This paper investigates the effect of combinations of these availability techniques on VM availability in a virtualized system where various software and hardware failures may occur. For each combination, we construct analytic models [...] rejuvenation mechanisms to improve VM availability; (2) prediction-based rejuvenation enhances VM availability much more than time-based VM rejuvenation when the prediction success probability is above 70%, regardless of whether failover and/or live VM migration is also deployed; (3) the failover mechanism outperforms live VM migration, although they can work together for higher VM availability, and they can be combined with software rejuvenation mechanisms for even higher availability; and (4) the time interval setting is critical to a time-based rejuvenation mechanism. These analytic results provide guidelines for deploying HA techniques and setting their parameters in a virtualized system.

    Availability modeling and evaluation on high performance cluster computing systems

    Cluster computing has been attracting more and more attention from both the industrial and the academic world for its enormous computing power, cost effectiveness, and scalability. A Beowulf-type cluster, for example, is a typical High Performance Computing (HPC) cluster system. Availability, as a key attribute of the system, needs to be considered at the system design stage and monitored at mission time. Moreover, system monitoring is a must to help identify defects and ensure the system's availability requirement. In this study, novel solutions which provide availability modeling, model evaluation, and data analysis as a single framework have been investigated. Three key components in the investigation are availability modeling, model evaluation, and data analysis. The general availability concepts and modeling techniques are briefly reviewed. The system's availability model is divided into submodels based upon their functionalities. Furthermore, an object-oriented Markov model specification to facilitate availability modeling and runtime configuration has been developed. Numerical solutions for Markov models are examined, especially the uniformization method. Alternative implementations of the method are discussed, particularly the cost of an alternative solution for small state-space models and different ways of solving large sparse Markov models. The dissertation also presents a monitoring and data analysis framework, which is responsible for failure analysis and availability reconfiguration. In addition, the event logs provided by the Lawrence Livermore National Laboratory have been studied and applied to validate the proposed techniques.

    Methodologies synthesis

    This deliverable deals with the modelling and analysis of interdependencies between critical infrastructures, focussing attention on two interdependent infrastructures studied in the context of CRUTIAL: the electric power infrastructure and the information infrastructures supporting management, control and maintenance functionality. The main objectives are to: 1) investigate the main challenges to be addressed for the analysis and modelling of interdependencies, 2) review the modelling methodologies and tools that can be used to address these challenges and support the evaluation of the impact of interdependencies on the dependability and resilience of the service delivered to the users, and 3) present the preliminary directions investigated so far by the CRUTIAL consortium for describing and modelling interdependencies.

    ACCIDENT ANALYSIS, RISK AND RELIABILITY MODELING OF MARINE TRANSPORTATION SYSTEMS

    Ph.D. (DOCTOR OF PHILOSOPHY)

    Methodology for automated Petri Net model generation to support Reliability Modelling

    As the complexity of engineering systems and processes increases, determining their optimal performance also becomes increasingly complex. There are various reliability methods available to model performance, but generating the models can become a significant task that is cumbersome, error-prone and tedious. Hence, over the years, work has been undertaken on automatically generating reliability models in order to detect the most critical components and design errors at an early stage, supporting alternative designs. Earlier methods are only semi-automated: they require user intervention to import system information into the algorithm, focus on specific domains, and cannot accurately model systems or processes with control loops and dynamic features. This thesis develops a novel method that can generate reliability models for complex systems and processes, based on Petri Net models. The process has been fully automated, with software developed that extracts the information required for the model from a topology diagram describing the system or process considered and generates the corresponding mathematical and graphical representations of the Petri Net model. Such topology diagrams are used in industrial sectors ranging from aerospace and automotive engineering to finance, defence, government, entertainment and telecommunications. Complex real-life scenarios are studied to demonstrate the application of the proposed method, followed by the verification, validation and simulation of the developed Petri Net models. Thus, the proposed method is seen to be a powerful tool to automatically obtain the PN modelling formalism from a topology diagram, commonly used in industry, by:
    - Handling and efficiently modelling systems and processes with a large number of components and activities respectively, dependent events and control loops.
    - Providing generic domain applicability.
    - Providing software independence by generating models readily understandable by the user without requiring further manipulation by any industrial software.
    Finally, the method documented in this thesis enables engineers to conduct reliability and performance analysis in a timely manner that ensures the results feed into the design process.