    Efficient TCP Connection Failover in Web Server Clusters

    Abstract: Web clusters continue to be widely used by large enterprises and organizations to host online services. Providing services without interruption is critical to the revenue and perceived image of both hosts and content providers, so server node failure and recovery should be invisible to clients. Most existing fault-tolerance schemes simply stop dispatching future client requests to the failed server. They do not recover the connections that the node was handling at the time of failure, which makes the failure visible to some clients. Making the failure transparent requires both application-layer and transport-layer mechanisms. While atomic application-layer primary-backup failover schemes have been addressed at length in previous literature, a transport-layer scheme is necessary to make failover invisible to clients. In this paper we describe a transparent TCP connection failover mechanism. Besides being transparent, our solution is highly efficient and does not need any dedicated hardware support.

    Availability modeling and evaluation of web-based services - A pragmatic approach

    Abstract: This thesis presents a pragmatic modeling approach that allows designers of web-based applications and systems to evaluate the availability of the service provided to users. Multiple sources of service unavailability are taken into account, in particular i) hardware and software failures affecting the servers, and ii) performance degradation (server overload, excessively long response times, etc.). A hierarchical multi-level approach based on performability modeling is proposed, combining Markov chains and queueing models. The main concepts and the feasibility of this approach are illustrated using a web-based travel agency as an example. Various analytical models and sensitivity studies are presented under different assumptions about the architecture, recovery strategies, faults, user profiles, and traffic characteristics.

    Latency-driven replication for globally distributed systems

    Steen, M.R. van [Promotor]; Pierre, G.E.O. [Copromotor]