766 research outputs found

    MACHS: Mitigating the Achilles Heel of the Cloud through High Availability and Performance-aware Solutions

    Get PDF
    Cloud computing is continuously growing as a business model for hosting information and communication technology applications. However, many concerns arise regarding the quality of service (QoS) offered by the cloud. One major challenge is the high availability (HA) of cloud-based applications. The key to achieving availability requirements is to develop an approach that is immune to cloud failures while minimizing the service level agreement (SLA) violations. To this end, this thesis addresses the HA of cloud-based applications from different perspectives. First, the thesis proposes a component’s HA-ware scheduler (CHASE) to manage the deployments of carrier-grade cloud applications while maximizing their HA and satisfying the QoS requirements. Second, a Stochastic Petri Net (SPN) model is proposed to capture the stochastic characteristics of cloud services and quantify the expected availability offered by an application deployment. The SPN model is then associated with an extensible policy-driven cloud scoring system that integrates other cloud challenges (i.e. green and cost concerns) with HA objectives. The proposed HA-aware solutions are extended to include a live virtual machine migration model that provides a trade-off between the migration time and the downtime while maintaining HA objective. Furthermore, the thesis proposes a generic input template for cloud simulators, GITS, to facilitate the creation of cloud scenarios while ensuring reusability, simplicity, and portability. Finally, an availability-aware CloudSim extension, ACE, is proposed. ACE extends CloudSim simulator with failure injection, computational paths, repair, failover, load balancing, and other availability-based modules

    A Coloured Petri Nets Based Attack Tolerance Framework

    Get PDF
    International audienceWeb services provide a general basis of convenient access and operation for cloud applications. However, such services become very vulnerable when being attacked, especially in the situation where service continuity is one of the most important requirements. This issue highlights the necessity to apply reliable and formal methods to attack tolerance in Web services. In this paper, we propose a Coloured Petri Nets based method for attack tolerance by modelling and analysing basic behaviours of attack-network interaction, attack detectors and their tolerance solutions. Furthermore, complex attacks can be analysed and tolerance solutions deployed by identifying these basic attack-network interactions and composing their solutions. The validity of our method is demonstrated through a case study on attack tolerance in cloud-based medical information storage

    Reliable Composite Web Services Execution: Towards a Dynamic Recovery Decision

    Get PDF
    AbstractDuring the execution of a Composite Web Service (CWS), different faults may occur that cause WSs failures. There exist strategies that can be applied to repair these failures, such as: WS retry, WS substitution, compensation, roll-back, or replication. Each strategy has advantages and disadvantages on different execution scenarios and can produce different impact on the CWS QoS. Hence, it is important to define a dynamic fault tolerant strategy which takes into account environment and execution information to accordingly decide the appropriate recovery strategy. We present a preliminary study in order to analyze the impact on the CWS total execution time of different recovery strategies on different scenarios. The experimental results show that under different conditions, recovery strategies behave differently. This analysis represents a first step towards the definition of a model to dynamically decide which recovery strategy is the best choice by taking into account the context-information when the failure occurs

    Perfomance Analysis and Resource Optimisation of Critical Systems Modelled by Petri Nets

    Get PDF
    Un sistema crítico debe cumplir con su misión a pesar de la presencia de problemas de seguridad. Este tipo de sistemas se suele desplegar en entornos heterogéneos, donde pueden ser objeto de intentos de intrusión, robo de información confidencial u otro tipo de ataques. Los sistemas, en general, tienen que ser rediseñados después de que ocurra un incidente de seguridad, lo que puede conducir a consecuencias graves, como el enorme costo de reimplementar o reprogramar todo el sistema, así como las posibles pérdidas económicas. Así, la seguridad ha de ser concebida como una parte integral del desarrollo de sistemas y como una necesidad singular de lo que el sistema debe realizar (es decir, un requisito no funcional del sistema). Así pues, al diseñar sistemas críticos es fundamental estudiar los ataques que se pueden producir y planificar cómo reaccionar frente a ellos, con el fin de mantener el cumplimiento de requerimientos funcionales y no funcionales del sistema. A pesar de que los problemas de seguridad se consideren, también es necesario tener en cuenta los costes incurridos para garantizar un determinado nivel de seguridad en sistemas críticos. De hecho, los costes de seguridad puede ser un factor muy relevante ya que puede abarcar diferentes dimensiones, como el presupuesto, el rendimiento y la fiabilidad. Muchos de estos sistemas críticos que incorporan técnicas de tolerancia a fallos (sistemas FT) para hacer frente a las cuestiones de seguridad son sistemas complejos, que utilizan recursos que pueden estar comprometidos (es decir, pueden fallar) por la activación de los fallos y/o errores provocados por posibles ataques. Estos sistemas pueden ser modelados como sistemas de eventos discretos donde los recursos son compartidos, también llamados sistemas de asignación de recursos. Esta tesis se centra en los sistemas FT con recursos compartidos modelados mediante redes de Petri (Petri nets, PN). Estos sistemas son generalmente tan grandes que el cálculo exacto de su rendimiento se convierte en una tarea de cálculo muy compleja, debido al problema de la explosión del espacio de estados. Como resultado de ello, una tarea que requiere una exploración exhaustiva en el espacio de estados es incomputable (en un plazo prudencial) para sistemas grandes. Las principales aportaciones de esta tesis son tres. Primero, se ofrecen diferentes modelos, usando el Lenguaje Unificado de Modelado (Unified Modelling Language, UML) y las redes de Petri, que ayudan a incorporar las cuestiones de seguridad y tolerancia a fallos en primer plano durante la fase de diseño de los sistemas, permitiendo así, por ejemplo, el análisis del compromiso entre seguridad y rendimiento. En segundo lugar, se proporcionan varios algoritmos para calcular el rendimiento (también bajo condiciones de fallo) mediante el cálculo de cotas de rendimiento superiores, evitando así el problema de la explosión del espacio de estados. Por último, se proporcionan algoritmos para calcular cómo compensar la degradación de rendimiento que se produce ante una situación inesperada en un sistema con tolerancia a fallos

    Extended Fault Taxonomy of SOA-Based Systems

    Get PDF
    Service Oriented Architecture (SOA) is considered as a standard for enterprise software development. The main characteristics of SOA are dynamic discovery and composition of software services in a heterogeneous environment. These properties pose newer challenges in fault management of SOA-based systems (SBS). A proper understanding of different faults in an SBS is very necessary for effective fault handling. A comprehensive three-fold fault taxonomy is presented here that covers distributed, SOA specific and non-functional faults in a holistic manner. A comprehensive fault taxonomy is a key starting point for providing techniques and methods for accessing the quality of a given system. In this paper, an attempt has been made to outline several SBSs faults into a well-structured taxonomy that may assist developers to plan suitable fault repairing strategies. Some commonly emphasized fault recovery strategies are also discussed. Some challenges that may occur during fault handling of SBSs are also mentioned

    5G Multi-access Edge Computing: Security, Dependability, and Performance

    Full text link
    The main innovation of the Fifth Generation (5G) of mobile networks is the ability to provide novel services with new and stricter requirements. One of the technologies that enable the new 5G services is the Multi-access Edge Computing (MEC). MEC is a system composed of multiple devices with computing and storage capabilities that are deployed at the edge of the network, i.e., close to the end users. MEC reduces latency and enables contextual information and real-time awareness of the local environment. MEC also allows cloud offloading and the reduction of traffic congestion. Performance is not the only requirement that the new 5G services have. New mission-critical applications also require high security and dependability. These three aspects (security, dependability, and performance) are rarely addressed together. This survey fills this gap and presents 5G MEC by addressing all these three aspects. First, we overview the background knowledge on MEC by referring to the current standardization efforts. Second, we individually present each aspect by introducing the related taxonomy (important for the not expert on the aspect), the state of the art, and the challenges on 5G MEC. Finally, we discuss the challenges of jointly addressing the three aspects.Comment: 33 pages, 11 figures, 15 tables. This paper is under review at IEEE Communications Surveys & Tutorials. Copyright IEEE 202

    A Specification Language for Performance and Economical Analysis of Short Term Data Intensive Energy Management Services

    Get PDF
    Requirements of Energy Management Services include short and long term processing of data in a massively interconnected scenario. The complexity and variety of short term applications needs methodologies that allow designers to reason about the models taking into account functional and non-functional requirements. In this paper we present a component based specification language for building trustworthy continuous dataflow applications. Component behaviour is defined by Petri Nets in order to translate to the methodology all the advantages derived from a mathematically based executable model to support analysis, verification, simulation and performance evaluation. The paper illustrates how to model and reason with specifications of advanced dataflow abstractions such as smart grids
    corecore