239 research outputs found

    Resilience-Building Technologies: State of Knowledge -- ReSIST NoE Deliverable D12

    This document is the first product of work package WP2, "Resilience-building and -scaling technologies", in the programme of jointly executed research (JER) of the ReSIST Network of Excellence.

    Integration of analysis techniques in security and fault-tolerance

    This thesis focuses on the integration of formal methodologies in security protocol analysis and fault-tolerance analysis. The research is developed in two directions: interdisciplinary and intra-disciplinary. In the former, we look for a beneficial interaction between analysis strategies for security protocols and for fault tolerance; in the latter, we search for connections among different analysis approaches within the security area. In the following we summarize the main results of the research.

    Intrusion-Tolerant Middleware: the MAFTIA approach

    The pervasive interconnection of systems all over the world has given computer services a significant socio-economic value, which can be affected both by accidental faults and by malicious activity. It would be appealing to address both problems in a seamless manner, through a common approach to security and dependability. This is the proposal of intrusion tolerance, where it is assumed that systems remain to some extent faulty and/or vulnerable and subject to attacks that can be successful, the idea being to ensure that the overall system nevertheless remains secure and operational. In this paper, we report some of the advances made in the European project MAFTIA, namely a set of concepts unifying security and dependability, and a modular and versatile architecture featuring several intrusion-tolerant middleware building blocks. We describe new architectural constructs and algorithmic strategies, such as the use of trusted components at several levels of abstraction, new randomization techniques, and new replica control and access control algorithms. The paper concludes by exemplifying the construction of intrusion-tolerant applications on the MAFTIA middleware, through a transaction support service.
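    To make the replica-control idea concrete, the following minimal Scala sketch shows majority voting over 2f+1 replicas, so that up to f compromised or crashed replicas cannot corrupt the result returned to the client. It illustrates only the general intrusion-tolerance principle; the names and the simplified replica model are assumptions of this sketch, not the MAFTIA protocols.

    // Minimal sketch of intrusion-tolerant replica voting (illustrative only).
    // With 2f+1 replicas, up to f arbitrary (e.g., intruded) replies are outvoted.
    object ReplicaVoting {

      // A replica answers a request; a compromised replica may return anything.
      type Replica = String => String

      // Query all replicas and accept the answer returned by a strict majority.
      // Returns None if no value reaches a majority (fault assumption violated).
      def vote(replicas: Seq[Replica], request: String): Option[String] = {
        val answers = replicas.map(r => r(request))
        val (best, count) =
          answers.groupBy(identity).map { case (v, vs) => (v, vs.size) }.maxBy(_._2)
        if (count > replicas.size / 2) Some(best) else None
      }

      def main(args: Array[String]): Unit = {
        val honest: Replica   = req => s"result($req)"
        val intruded: Replica = _   => "forged"   // behaves arbitrarily
        // 2f+1 = 3 replicas tolerate f = 1 intrusion.
        println(vote(Seq(honest, intruded, honest), "balance?")) // Some(result(balance?))
      }
    }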

    Robust collaborative services interactions under system crashes and network failures

    Electronic collaboration has grown significantly in the last decade, with applications in many different areas such as shopping, trading, and logistics. Often electronic collaboration is based on automated business processes managed by different companies and connected through the Internet. Such a business process is normally deployed on a process engine, which is a piece of software that is able to execute the business process with the help of infrastructure services (operating system, database, network service, etc.). With the possibility of system crashes and network failures, the design of robust interactions for collaborative processes is a challenge. System crashes and network failures are common events, which may happen in various information systems, e.g., servers, desktops, and mobile devices.

    Business processes use messages to synchronize their state. If a process changes its state, it sends a message to its peer processes in the collaboration to inform them about this change. System crashes and network failures may result in loss of messages. In this case, the state change is performed by some but not all processes, resulting in global state/behavior inconsistencies and possibly deadlocks. In general, a state inconsistency is not automatically detected and recovered by the process engine. Recovery in this case often has to be performed manually after checking execution traces, which is potentially slow, error prone, and expensive. Existing solutions either shift the burden to business process developers or require additional infrastructure services. For example, fault handling approaches require that the developers are aware of possible failures and their recovery strategies, while transaction approaches require a coordinator and coordination protocols deployed in the infrastructure layer.

    Our idea to solve this problem is to replace each original process by a robust counterpart, which is obtained from the original process through an automatic transformation before deployment on the process engine. The robust process is deployed with the same infrastructure services and automatically recovers from message loss and state inconsistencies caused by system crashes and network failures. In other words, the robust processes are transparent to developers while leaving the infrastructure unmodified. We assume a synchronous interaction scenario for collaborative processes: an initiator sends a request message and waits for a response message, while a responder receives the request message, applies some state change, and sends the response message. With our proposed transformation we obtain robust processes in which each process in the responder role caches the response message if its state has been changed by the previously received request message. Possible state inconsistencies are recognized by using timers and information provided by the infrastructure, and resolved by using the cached state and by retrying failed interactions. We also considered more complex interaction scenarios with multiple initiator and responder instances (1-n, n-1 and n-n client-server configurations).

    We have provided a formal proof of the correctness of our transformation solution. We have also done a performance analysis and determined the overhead of the generated (robust) processes compared to the original processes. Since this overhead is low compared to the performance differences that exist as a consequence of using different process engines, we argue that the generated robust processes are applicable in real-life business environments. Through this work, we have identified the failure situations that affect the global state/behavior of collaborative business processes, and we have defined transformations for deriving robust processes that are capable of surviving the identified failures.
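    The core of the transformation (the initiator retries the interaction while the responder caches and replays its response) can be illustrated with the small Scala sketch below. The class and method names and the in-memory message-loss model are assumptions made for this illustration; the thesis applies the transformation to deployed business processes rather than to plain objects.

    // Sketch of the retry-plus-response-cache idea for synchronous interactions.
    object RobustInteraction {

      final case class Request(id: String, payload: String)
      final case class Response(id: String, payload: String)

      // The responder caches its response per request id, so a retried request is
      // answered from the cache and the state change is applied at most once.
      class Responder {
        private var state = 0
        private val cache = scala.collection.mutable.Map.empty[String, Response]

        def handle(req: Request): Response =
          cache.getOrElseUpdate(req.id, {
            state += 1                       // the actual state change
            Response(req.id, s"state=$state")
          })
      }

      // The initiator retries until a response arrives; on each attempt the request
      // reaches the responder but the response may be lost on the way back.
      def call(responder: Responder, req: Request,
               responseLost: () => Boolean, maxRetries: Int): Option[Response] =
        (0 to maxRetries).iterator.map { _ =>
          val resp = responder.handle(req)
          if (responseLost()) None else Some(resp)
        }.collectFirst { case Some(r) => r }

      def main(args: Array[String]): Unit = {
        val responder = new Responder
        var attempt = 0
        val lossy = () => { attempt += 1; attempt < 3 }   // first two responses are lost
        // Despite two lost responses, the responder's state changes only once.
        println(call(responder, Request("r1", "update"), lossy, maxRetries = 5))
      }
    }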

    Fault Management For Service-Oriented Systems

    Service Oriented Architectures (SOAs) enable the automatic creation of business applications from independently developed and deployed Web services. As Web services are inherently unreliable, delivering reliable Web service compositions over unreliable Web services is a significant and challenging problem. The process requires monitoring the system's behavior, determining when and why faults occur, and then applying fault prevention/recovery mechanisms to minimize their impact and/or recover from them. However, it is hard to apply a non-distributed management approach to SOA, since a manager needs to communicate with the different components through authentication. In SOA, a business process can terminate successfully if all services finish their work correctly, which requires providing alternative plans in case of faults. However, the business process itself may encounter different faults, because a fault may occur anywhere at any time due to the nature of SOA.

    In this work, we propose a new fault management technique (FLEX) and identify several improvements over existing techniques. First, existing techniques rely mainly on static information, while FLEX is based on dynamic information. Second, existing frameworks use a limited number of attributes, while we use all possible attributes by identifying them as either required or optional. Third, FLEX reduces the comparison cost (time and space) by filtering out, at each level, the services needed for evaluation. In general, FLEX is divided into two phases: Phase I computes service reliability and utility, while Phase II performs runtime planning and evaluation. In Phase I, we assess the fault likelihood of each service using a combination of techniques (e.g., Hidden Markov Models, reputation, and clustering). In Phase II, we build a recovery plan to execute in case of fault(s) and calculate the overall system reliability based on the fault occurrence likelihoods assessed for all the services that are part of the current composition.

    FLEX is novel because it relies on the key activities of the autonomic control loop (i.e., collect, analyze, decide, plan, and execute) to support dynamic management based on changes in user requirements and QoS levels. Our technique dynamically evaluates the performance of Web services based on their previous history and user requirements, assesses the likelihood of fault occurrence, and uses the result to create (multiple) recovery plans. Moreover, we define a method to assess the overall system reliability by evaluating the performance of individual recovery plans when invoked together. The experimental results show that our technique improves service selection quality by selecting the services with the highest score and improves the overall system performance in comparison with existing work. In the future, we plan to investigate techniques for monitoring service-oriented systems and assess online negotiation possibilities for combining different services to create high-performance systems.
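    As a rough illustration of the Phase II idea, the Scala sketch below estimates the reliability of a composition from per-service fault likelihoods (assuming independent failures) and ranks candidate recovery plans by a combined reliability/utility score. The data types, the weighting, and the numbers are assumptions of this sketch, not the FLEX scoring model.

    // Ranking recovery plans by reliability and utility (illustrative sketch).
    object CompositionReliability {

      final case class Service(name: String, faultLikelihood: Double, utility: Double)

      final case class Plan(services: Seq[Service]) {
        // The composition succeeds only if every service succeeds (independence assumed).
        def reliability: Double = services.map(s => 1.0 - s.faultLikelihood).product
        def avgUtility: Double  = services.map(_.utility).sum / services.size
        // The 0.7/0.3 weighting is an arbitrary choice made for this sketch.
        def score: Double = 0.7 * reliability + 0.3 * avgUtility
      }

      def bestPlan(plans: Seq[Plan]): Plan = plans.maxBy(_.score)

      def main(args: Array[String]): Unit = {
        val primary = Plan(Seq(Service("pay", 0.05, 0.9), Service("ship",    0.10, 0.8)))
        val backup  = Plan(Seq(Service("pay", 0.05, 0.9), Service("shipAlt", 0.02, 0.6)))
        println(f"primary reliability: ${primary.reliability}%.3f, backup: ${backup.reliability}%.3f")
        // Prints the services of the plan with the higher combined score.
        println(s"selected plan: ${bestPlan(Seq(primary, backup)).services.map(_.name).mkString(", ")}")
      }
    }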

    Extension of a behavior composition framework in the presence of failures using recovery techniques and Akka

    Abstract: Fault tolerance is an essential property to be satisfied in the composition of services, but reaching a high level of fault tolerance remains a challenge. In the area of ubiquitous computing, the composition of services is inevitable when a request cannot be carried out by a single service, but only by a combination of several services. This thesis studies fault tolerance in the context of a general behavior composition framework. This approach raises, first, the problem of the synthesis of controllers (or compositions) in order to coordinate a set of available services to realize a new service, the target service, and, second, that of exploiting the set of all compositions to make the new service fault tolerant. Although a solution has been proposed by the authors of the behavior composition framework, it is incomplete and has not been evaluated experimentally or in situ. This thesis makes two contributions to this problem. On the one hand, it considers the case in which the service selected by the controller is temporarily or permanently unavailable, exploiting recovery techniques to identify a consistent state of the system from which it may progress using other services, or to leave it in a consistent state when none of the available services allows further progression. On the other hand, it evaluates several recovery solutions, each useful in particular failure situations, using a case study implemented with Akka, a toolkit that facilitates the development of reactive, concurrent and distributed systems.
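    The switching behavior described above (the controller falls back to another available service when the selected one becomes unavailable) can be sketched with classic Akka actors as follows. This is not the implementation evaluated in the thesis: the message types, actor names, and failure model are assumptions of the sketch, which only requires the classic akka-actor module on the classpath.

    // A controller delegating to available services and switching on failure.
    import akka.actor.{Actor, ActorRef, ActorSystem, Props}

    final case class Request(task: String)
    final case class Done(task: String)
    final case class Failed(task: String)

    // An available service; an unreliable one reports failure instead of finishing.
    class ServiceActor(reliable: Boolean) extends Actor {
      def receive: Receive = {
        case Request(task) => sender() ! (if (reliable) Done(task) else Failed(task))
      }
    }

    // The controller realizes the target service by delegating each request to the
    // currently selected service and falling back to the next one on failure.
    // Assumes at least one service is available when a request arrives.
    class Controller(services: List[ActorRef]) extends Actor {
      private var remaining = services
      private var client: ActorRef = context.system.deadLetters

      def receive: Receive = {
        case req: Request =>
          client = sender()
          remaining.head ! req
        case done: Done =>
          client ! done                              // progression succeeded
        case Failed(task) =>
          remaining = remaining.tail                 // drop the unavailable service
          if (remaining.nonEmpty) remaining.head ! Request(task)
          else client ! Failed(task)                 // no further progression possible
      }
    }

    object CompositionDemo extends App {
      val system = ActorSystem("composition")
      val flaky  = system.actorOf(Props(new ServiceActor(reliable = false)), "flaky")
      val stable = system.actorOf(Props(new ServiceActor(reliable = true)),  "stable")
      val controller = system.actorOf(Props(new Controller(List(flaky, stable))), "controller")
      controller ! Request("book-room")              // eventually handled by "stable"
      // In a real program the actor system would be terminated when done.
    }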