
    Performance and Dependability of Fault Tolerance Protocols

    In the modern era of on-demand, ubiquitous computing, where applications and services are deployed on well-provisioned, well-managed infrastructures administered by large cloud providers such as Amazon, Google, Microsoft, and Oracle, performance and dependability of systems have become primary objectives. Cloud computing has made Quality-of-Service (QoS) factors such as availability, reliability, liveness, safety, and security essential to the complete definition of a system. Indeed, computing systems must be resilient in the presence of failures and attacks to prevent inaccessibility, which can lead to expensive maintenance costs and loss of business. As the number of components in cloud systems grows, faults occur more commonly, resulting in frequent cloud outages and failures to guarantee QoS. Cloud providers have seen episodic incidents of arbitrary (i.e., Byzantine) faults, where systems exhibit unpredictable behavior: incorrect responses to client requests, corrupt messages, intentional delaying of messages, violations of request ordering, and so on.
    This has led researchers to study Byzantine Fault Tolerance (BFT) extensively and to propose numerous protocols and software prototypes. These BFT solutions not only provide consistent and available services despite arbitrary failures; they also aim to reduce the cost and performance overhead incurred by the underlying systems. However, BFT prototypes have been evaluated in ad-hoc settings, considering either ideal conditions or very limited faulty scenarios. This has failed to convince practitioners to adopt BFT protocols in distributed systems: some question the applicability of expensive and complex BFT techniques to tolerating arbitrary faults, while others are skeptical of their adequacy. This thesis addresses precisely this problem and presents a comprehensive benchmarking environment that eases the setup of execution scenarios for analyzing and comparing the effectiveness and robustness of existing BFT proposals.
    The contributions of this dissertation are as follows. First, we introduce a generic architecture for benchmarking distributed protocols. This architecture comprises reusable components for building benchmarks for performance and dependability analysis of distributed protocols. It allows defining workloads and faultloads and injecting them, and it produces performance, dependability, and low-level system and network statistics. The thesis also presents the benefits of such a general architecture. Second, we present BFT-Bench, the first BFT benchmark, for analyzing and comparing representative BFT protocols under identical scenarios. BFT-Bench allows end-users to evaluate different BFT implementations under user-defined faulty behaviors and varying workloads. It automatically deploys these BFT protocols in a distributed setting, with the ability to monitor and report performance and dependability aspects. In our results, we empirically compare existing state-of-the-art BFT protocols under various workloads and fault scenarios with BFT-Bench, demonstrating its effectiveness in practice.
    Overall, this thesis aims to make BFT benchmarking easy to adopt by developers and end-users of BFT protocols. The BFT-Bench framework is intended to help users perform efficient comparisons of competing BFT implementations and to incorporate effective solutions to the loopholes detected in BFT prototypes. Furthermore, this dissertation strengthens the case for BFT techniques as a means of ensuring correct and continued progress of distributed systems during critical fault occurrences.
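    As an illustration of how such a benchmark can couple a workload with a declarative faultload, here is a minimal Python sketch. All names (FaultType, FaultAction, Scenario) and the send_request/inject callbacks are hypothetical stand-ins for illustration, not BFT-Bench's actual API.

```python
import time
from dataclasses import dataclass, field
from enum import Enum, auto

class FaultType(Enum):
    """Byzantine behaviours named in the abstract."""
    REPLICA_CRASH = auto()       # replica stops responding
    MESSAGE_DELAY = auto()       # intentional delaying of messages
    MESSAGE_CORRUPTION = auto()  # sending corrupt messages
    ORDER_VIOLATION = auto()     # disobeying the request ordering

@dataclass
class FaultAction:
    fault: FaultType
    target_replica: int
    inject_at_s: float           # offset from benchmark start, seconds

@dataclass
class Scenario:
    clients: int                 # concurrent clients (the workload)
    duration_s: float
    faultload: list = field(default_factory=list)

def run(scenario, send_request, inject):
    """Drive the workload, firing each fault at its scheduled time.
    send_request/inject must be supplied by the protocol under test."""
    start = time.time()
    pending = sorted(scenario.faultload, key=lambda a: a.inject_at_s)
    latencies = []
    while time.time() - start < scenario.duration_s:
        now = time.time() - start
        while pending and pending[0].inject_at_s <= now:
            inject(pending.pop(0))     # hand the fault to the injector
        t0 = time.time()
        send_request()                 # one client request round-trip
        latencies.append(time.time() - t0)
    return latencies                   # raw data for performance statistics
```

    The point of the declarative Scenario is that the same faultload can be replayed unchanged against different BFT implementations, which is what makes the comparison under identical scenarios possible.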

    A Method to Reduce the Cost of Resilience Benchmarking of Self-Adaptive Systems

    Ensuring the resilience of self-adaptive systems used in critical infrastructure is a concern, as their failure has severe societal and financial consequences. The growing scale and complexity of society's workload demands, and of the systems built to cope with them, increase the anxiety surrounding service disruptions. Self-adaptive mechanisms give systems dynamic behavior in an effort to improve their resilience to runtime changes that would otherwise result in service disruption or failure, such as faults, errors, and attacks. The evaluation of a self-adaptive system's resilience is thus critical to ensure expected operational qualities and elicit trust in its services. However, resilience benchmarking is often overlooked or avoided due to the high cost of evaluating the runtime behavior of large, complex self-adaptive systems against an almost infinite number of possible runtime changes. Since testing costs have been found to account for 50 to 80% of total system costs, researchers have focused on techniques that reduce the overall cost of benchmarking while keeping the evaluation comprehensive. These test-suite minimization techniques remove irrelevant, redundant, and repetitive test cases so that only relevant tests that adequately elicit the expected system responses are enumerated. However, such approaches require that an exhaustive test suite be defined first, after which the irrelevant tests are filtered out, potentially negating any cost savings. This dissertation provides a new approach for defining a resilience changeload for self-adaptive systems, incorporating goal-oriented requirements engineering techniques to extract system information and guide the identification of relevant runtime changes. The approach constructs a goal refinement graph consisting of the system's refined goals, runtime actions, self-adaptive agents, and underlying runtime assumptions, which is used to identify conditions that obstruct runtime goal attainment. Graph theory is then used to gauge the impact of obstacles on runtime goal attainment, and those that exceed the relevance requirement are included in the resilience changeload for enumeration. The use of system knowledge to guide the changeload definition process increases the relevance of the resilience changeload while minimizing the test suite, reducing overall benchmarking costs. Analysis of case-study results confirmed that the new approach was more cost-effective than previous work on the same subject system: it reduced overall costs by 79.65%, increased the relevance of the defined test suite, reduced wasted effort, and provided twice the return on investment.
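    As a sketch of the graph-theoretic filtering step described above, the following Python fragment scores each obstacle by the fraction of goals reachable (and hence potentially blocked) from the node it obstructs, and keeps those meeting a relevance threshold. The encoding of the goal refinement graph and the impact metric are assumptions for illustration, not the dissertation's exact definitions.

```python
import networkx as nx

def relevant_obstacles(goal_graph, obstacles, threshold):
    """goal_graph: DiGraph whose edges point from each node to the
    goals it supports; obstacles: {obstacle_name: obstructed_node}.
    An obstacle enters the changeload if the fraction of goals
    downstream of the node it obstructs meets the threshold."""
    goals = [n for n, d in goal_graph.nodes(data=True) if d.get("goal")]
    changeload = []
    for name, node in obstacles.items():
        blocked = nx.descendants(goal_graph, node) | {node}
        impact = sum(g in blocked for g in goals) / len(goals)
        if impact >= threshold:
            changeload.append((name, impact))
    return sorted(changeload, key=lambda x: -x[1])

# Tiny hypothetical example: a sensor feed supports an adaptation goal.
g = nx.DiGraph()
g.add_edge("sensor_feed", "detect_overload")
g.add_edge("detect_overload", "maintain_sla")
g.add_nodes_from(["detect_overload", "maintain_sla"], goal=True)
print(relevant_obstacles(g, {"feed_outage": "sensor_feed"}, 0.5))
```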

    mCrash: a framework for the evaluation of mobile devices trustworthiness properties

    A rationale and framework for the evaluation of mobile devices' robustness and trustworthiness properties, using a Windows Mobile 5.0 testbed, is presented. The methodology employs software fault-injection techniques at the operating system's interface level and customises tests to the behaviour of the software.
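    As a generic illustration of interface-level fault injection (the testbed itself targets the Windows Mobile 5.0 API), the following Python sketch wraps a function standing in for an OS interface call and corrupts its integer parameters before the real call executes. The corruption values and the inject_on decorator are hypothetical choices, not mCrash's implementation.

```python
import functools
import random

def corrupt_int(value):
    """Boundary and bit-flip corruptions typical of robustness testing."""
    return random.choice([0, -1, 2**31 - 1, value ^ 0x1])

def inject_on(func, corrupt={int: corrupt_int}, rate=1.0):
    """Wrap an interface function so parameters of selected types are
    replaced by corrupted values before the real call executes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if random.random() < rate:
            args = tuple(corrupt[type(a)](a) if type(a) in corrupt else a
                         for a in args)
        return func(*args, **kwargs)
    return wrapper

# Stand-in for an OS call; observing how it reacts to corrupted lengths
# is the kind of behaviour such a testbed records.
@inject_on
def read_bytes(fd: int, length: int) -> bytes:
    return b"\x00" * max(0, length)
```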

    A reliability benchmark for actor-based server languages

    Servers are a key element of current IT infrastructures and must often deal with large numbers of concurrent requests. Reliability is crucial, as any disruption is extremely costly. Some important reliable servers are implemented in actor languages/libraries that provide process isolation and supervision. Reliability benchmarks model fault scenarios to measure the reliability characteristics of systems. The paper presents the design and implementation of a new reliability benchmark for actor-based server languages: Supervised Communicating Processes (SCP). SCP extends an existing server concurrency benchmark by supervising server actors/processes. We outline Erlang and Scala/Akka SCP implementations and an associated fault injector. We compare the reliability characteristics of Erlang and Scala/Akka for server-style computations using SCP in four main experiments: (1) progressive permanent failures, where a percentage of server processes fail permanently; (2) recovery from different percentages (0%..20%) of failures occurring uniformly, randomly, or in bursts, and with a range of supervisor/supervisee ratios; (3) comparing how the Erlang and Scala/Akka SCPs handle burst, random, and uniform failure patterns; and (4) comparing how Erlang and Scala/Akka handle server actor/process faults with different fault patterns and failure rates.
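    The paper's implementations are in Erlang and Scala/Akka; as a language-neutral illustration of the supervise-and-restart pattern that SCP benchmarks, here is a minimal Python sketch using OS processes. The Supervisor class, the fault-injection helper, and all parameters are hypothetical stand-ins, not the SCP code.

```python
import multiprocessing as mp
import random
import time

def server(inbox):
    """A 'server process': block on requests until killed."""
    while True:
        inbox.get()          # ...handle one request here...

class Supervisor:
    """Restart supervised workers that die, as an actor supervisor would."""
    def __init__(self, n_workers):
        self.workers = [self._spawn() for _ in range(n_workers)]

    def _spawn(self):
        q = mp.Queue()
        p = mp.Process(target=server, args=(q,), daemon=True)
        p.start()
        return p, q

    def supervise(self):
        for i, (p, q) in enumerate(self.workers):
            if not p.is_alive():                 # failure detection
                self.workers[i] = self._spawn()  # recovery by restart

def inject_failures(sup, fraction):
    """Fault injector: kill a given fraction of the server processes."""
    victims = random.sample(sup.workers, int(fraction * len(sup.workers)))
    for p, _ in victims:
        p.terminate()

if __name__ == "__main__":
    sup = Supervisor(10)
    inject_failures(sup, 0.2)    # 20% failure rate, as in experiment (2)
    time.sleep(0.5)
    sup.supervise()              # the recovery step a benchmark would time
```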

    Multi-criteria analysis of measures in benchmarking: Dependability benchmarking as a case study

    Benchmarks enable the comparison of computer-based systems according to a variable set of criteria, such as dependability, security, performance, cost, and/or power consumption. It is not its difficulty, but rather its lack of mathematical accuracy, that keeps multi-criteria analysis of results a subjective process, rarely addressed explicitly in existing benchmarks. It is thus not surprising that industrial benchmarks rely only on a reduced set of easy-to-understand measures, especially when considering complex systems. This keeps the process of result interpretation straightforward, unambiguous, and accurate, but it limits the richness and depth of the analysis. Academia, as a result, prefers to characterize complex systems with a wider set of measures. Marrying the requirements of industry and academia in a single proposal remains a challenge today. This paper addresses this question by reducing the uncertainty of the analysis process using quality (score-based) models. At measure-definition time, these models make explicit (i) the requirements imposed on each type of measure, which may vary from one context of use to another, and (ii) the type, and intensity, of the relation between the considered measures. At measure-analysis time, they provide a consistent, straightforward, and unambiguous method for interpreting the resulting measures. The methodology and its practical use are illustrated through three case studies from the dependability benchmarking domain, a domain where various criteria, including both performance and dependability, are typically considered during the analysis of benchmark results. Although the approach is applied here to dependability benchmarks, the general formulation of the solution makes its usefulness for any type of benchmark quite evident.
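    As an illustration of what a quality (score-based) model can look like, the following Python sketch maps each raw measure onto [0, 1] against requirements declared at measure-definition time and aggregates the scores by weight into one unambiguous number per system. The MeasureSpec fields and the example numbers are assumptions for illustration, not the paper's model.

```python
from dataclasses import dataclass

@dataclass
class MeasureSpec:
    """Requirements imposed on one measure in a given context of use."""
    best: float      # raw value that scores 1.0
    worst: float     # raw value that scores 0.0
    weight: float    # relative importance versus the other measures

def score(spec, value):
    """Map a raw measure onto [0, 1] against its requirements."""
    s = (value - spec.worst) / (spec.best - spec.worst)
    return min(1.0, max(0.0, s))

def aggregate(model, results):
    """Weighted average of per-measure scores: one number per system."""
    total_w = sum(m.weight for m in model.values())
    return sum(m.weight * score(m, results[name])
               for name, m in model.items()) / total_w

# Hypothetical dependability-benchmark context mixing performance and
# availability measures with context-specific requirements and weights.
model = {
    "throughput_ops": MeasureSpec(best=10_000, worst=0, weight=0.4),
    "availability_pct": MeasureSpec(best=100.0, worst=99.0, weight=0.6),
}
print(aggregate(model, {"throughput_ops": 7_500, "availability_pct": 99.8}))
```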

    DEPENDABILITY BENCHMARKING OF NETWORK FUNCTION VIRTUALIZATION

    Network Function Virtualization (NFV) is an emerging networking paradigm that aims to reduce costs and time-to-market, improve manageability, and foster competition and innovative services. NFV exploits virtualization and cloud computing technologies to turn physical network functions into Virtualized Network Functions (VNFs), which are implemented in software and run as Virtual Machines (VMs) on commodity hardware located in high-performance data centers, namely Network Function Virtualization Infrastructures (NFVIs). The NFV paradigm relies on cloud computing and virtualization technologies to provide carrier-grade services, i.e., services that are highly reliable and available, with fast and automatic failure-recovery mechanisms. The availability of many virtualization solutions for NFV raises the question of which virtualization technology should be adopted in order to fulfill these requirements. Currently, there are few solutions for analyzing, in quantitative terms, the performance and reliability trade-offs that are important concerns for the adoption of NFV. This thesis deals with the assessment of the reliability and performance of NFV systems. It proposes a methodology, including context, measures, and faultloads, for conducting dependability benchmarks in NFV according to the general principles of dependability benchmarking. To this aim, a fault injection framework has been designed and implemented for the virtualization technologies used as case studies in this thesis. This framework is used to conduct an extensive experimental campaign comparing two candidate virtualization technologies for NFV adoption: the commercial, hypervisor-based virtualization platform VMware vSphere, and the open-source, container-based virtualization platform Docker. These technologies are assessed in the context of a high-availability, NFV-oriented IP Multimedia Subsystem (IMS). The analysis of the experimental results reveals that (i) fault-management mechanisms are crucial in NFV to provide accurate failure detection and start the subsequent failover actions, and (ii) fault injection is a valuable way to introduce uncommon scenarios in the NFVI, which can be fundamental to providing a highly reliable service in production.
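    As an illustration of the kind of fault injection experiment described, the following Python sketch kills a containerised VNF with the standard docker kill command and measures how long the service takes to answer a health probe again. The container name and probe command in the usage comment are hypothetical examples; the thesis's framework is more elaborate than this.

```python
import subprocess
import time

def kill_vnf(container):
    """Inject a crash fault: abruptly kill a containerised VNF."""
    subprocess.run(["docker", "kill", container], check=True)

def healthy(probe_cmd):
    """Service health probe, e.g. an HTTP check against the IMS front end."""
    return subprocess.run(probe_cmd, capture_output=True).returncode == 0

def measure_recovery(container, probe_cmd, timeout_s=120.0):
    """Time from fault injection until the service answers again."""
    kill_vnf(container)
    t0 = time.time()
    while time.time() - t0 < timeout_s:
        if healthy(probe_cmd):
            return time.time() - t0      # failover/recovery latency
        time.sleep(0.5)
    return None                          # failure was not masked in time

# Hypothetical usage:
# measure_recovery("ims-node-1", ["curl", "-sf", "http://ims.example/ping"])
```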

    UML-Based Modeling of Robustness Testing

    The aim of robustness testing is to characterize the behavior of a system in the presence of erroneous or stressful input conditions. It is a well-established approach in the dependability community, which has a long tradition of testing based on fault injection. However, a recurring problem is the insufficient documentation of experiments, which may prevent their replication. Our work investigates whether UML-based documentation could be used. It proposes an extension of the UML Testing Profile that accounts for the specificities of robustness testing experiments. The extension also reuses some elements of the QoSFT profile targeting measurements. Its ability to model realistic experiments is demonstrated on a case study from dependability research.

    An approach for evaluation of efficacy of vulnerability scanning tools in web applications

    Advisors: Mário Jino, Regina Lúcia de Oliveira Moraes. Master's dissertation, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Nowadays, most web applications are developed under strict time and cost constraints. The complexity of software products is ever greater, leading to security vulnerabilities caused by bad coding. Tools called vulnerability scanners are applied to detect security vulnerabilities in web applications automatically; the trustworthiness of their results is therefore essential. This work proposes an approach to assess the efficacy of vulnerability scanning tools. The proposed approach is based on fault injection techniques and attack tree models: the results of applying three scanners are assessed in the presence of realistic software faults responsible for security vulnerabilities in web applications. Attack trees represent the steps of performing an attack, making it possible to verify whether security vulnerabilities detected by a scanner actually exist in the application under test. The approach can also be used to perform security tests, as it permits the detection of vulnerabilities through the execution of attack scenarios.
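    As an illustration of how an attack tree can confirm a scanner finding, the following Python sketch evaluates a tree of attack steps with AND/OR refinement: a reported vulnerability counts as a true positive only if some complete attack path succeeds against the application under test. The node structure and the example steps are hypothetical, not the dissertation's tooling.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Attack-tree node: a leaf runs one concrete attack step; inner
    nodes combine children with AND (all steps) or OR (any variant)."""
    name: str
    op: str = "LEAF"                 # "AND", "OR", or "LEAF"
    step: callable = None            # leaf: returns True if the step succeeds
    children: list = field(default_factory=list)

def feasible(node):
    """A vulnerability is confirmed if its attack tree is feasible."""
    if node.op == "LEAF":
        return node.step()
    results = [feasible(c) for c in node.children]
    return all(results) if node.op == "AND" else any(results)

# Confirming a reported SQL injection; the lambdas stand in for real
# probes executed against the application under test.
tree = Node("exploit SQLi", "AND", children=[
    Node("reach input form", step=lambda: True),
    Node("break the query", "OR", children=[
        Node("quote payload", step=lambda: False),
        Node("comment payload", step=lambda: True),
    ]),
])
confirmed = feasible(tree)   # True -> the scanner finding is a true positive
```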