4 research outputs found

    Design and implementation of a coordination service for distributed applications (In-memory Paxos)

    Coordinating different, independent processes is an important aspect of distributed systems. To coordinate with each other, the participants of a distributed system often have to agree on some common knowledge, such as the locking of a shared resource. The general problem of how to reach agreement on a value is known as the consensus problem. In many practical systems, consensus is outsourced to a service that is itself distributed across multiple servers to increase its availability. Each server can be contacted by clients that want to reach consensus on a specific value. Examples are Google's locking service Chubby and Yahoo's coordination service ZooKeeper. The standard Paxos algorithm solves this problem in an environment where nodes may recover after a crash and messages can be delayed indefinitely. However, a system based on the classical Paxos algorithm relies on expensive stable storage operations to guarantee that a crashed and recovered Paxos server can still participate in the protocol. Studies have shown that these disk costs are the bottleneck of the whole system. This work investigates a performance-oriented version of Paxos that still solves the consensus problem but trades availability of the consensus system for performance by not using stable storage operations. Without careful design this can be problematic, because a Paxos server that has lost its memory can jeopardize the protocol. This is solved by no longer allowing a crashed and recovered Paxos server to participate in the protocol under its old id. Instead, a recovered server rejoins the protocol with a new id, so that no active process assumes anything about the recovered process. A recovered server can join the group of active processes only if a majority of servers is active and agrees to the join. Our evaluations show that a coordination system based on the in-memory Paxos approach has a very short response time of only one millisecond and a high throughput of up to 18000 write requests per second.
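
    The rejoin rule described in this abstract can be illustrated with a small sketch. It is not the thesis's implementation: the class `InMemoryGroup`, its methods, and the choice to drop the old id from the configuration on a successful rejoin are assumptions made only for illustration. The sketch models membership bookkeeping alone, not Paxos rounds or message passing.

```python
from itertools import count


class InMemoryGroup:
    """Toy membership view for a diskless Paxos group (hypothetical names)."""

    def __init__(self, size: int) -> None:
        self._ids = count(1)                      # source of fresh, never reused ids
        self.members = {next(self._ids) for _ in range(size)}
        self.active = set(self.members)           # servers that are currently up

    def crash(self, server_id: int) -> None:
        # A crashed server loses its in-memory state and leaves the active set.
        self.active.discard(server_id)

    def has_active_majority(self) -> bool:
        # Strict majority of the current configuration must be up.
        return len(self.active) > len(self.members) // 2

    def rejoin(self, old_id: int):
        """Admit a recovered server under a new id, or return None if no majority."""
        if old_id in self.active:
            raise ValueError("server has not crashed")
        if not self.has_active_majority():
            return None                           # nobody can approve the join
        new_id = next(self._ids)
        # Assumption: the old id is retired so no process relies on its past state.
        self.members.discard(old_id)
        self.members.add(new_id)
        self.active.add(new_id)
        return new_id


group = InMemoryGroup(3)
group.crash(3)
fresh_id = group.rejoin(3)   # two of three servers are still active, so this succeeds
```

    If too many servers crash at once, `rejoin` returns `None`: this is the availability that the approach gives up in exchange for avoiding stable storage.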

    Control-plane consistency in software-defined networking: distributed controller synchronization using the ISIS² toolkit

    Software-defined Networking (SDN) is a recent approach to computer networks that eases network administration by separating the control plane from the data plane. The data plane only forwards packets according to rules specified by the control plane. The control plane, implemented by software called the controller, determines the forwarding rules based on a global view of the network. To increase fault tolerance and to eliminate a possible performance bottleneck, the controller can be distributed. The data that holds the global view is conventionally synchronized using distributed key-value stores that offer fixed consistency semantics, which do not respect the heterogeneous consistency requirements of the data items in the controller state. The virtual synchrony model, an alternative to the commonly used state machine replication method, offers a more flexible solution that can result in higher performance when certain assumptions can be made about the data kept in the controller state. This thesis proposes a distributed controller based on OpenDaylight, a state-of-the-art SDN controller, and the ISIS² library, which implements the virtual synchrony model. The modular architecture of the proposed controller and the use of a platform-independent data model allow parts of the system to be extended or replaced. The implementation of the distributed controller is described, and its component-level (micro) and overall (macro) performance is evaluated with benchmarks.

    Replicated execution of workflows

    Workflows are the de facto standard for managing and optimizing business processes. Workflows allow businesses to automate interactions between business locations and partners residing anywhere on the planet. This, however, requires the workflows to be executed in a distributed and dynamic environment, where device and communication failures occur quite frequently. If a workflow execution becomes unavailable through such failures, the business operations that rely on the workflow might be hindered or even stopped, resulting in financial loss. Consequently, availability is a key concern when using workflows in dynamic environments. In this thesis, we propose replication schemes for workflow engines to ensure the availability of the workflows that these engines execute. A workflow that is executed by a replicated workflow engine has to yield the same result as a non-replicated execution of that workflow. To this end, we formally define the equivalence of a replicated and a non-replicated execution, called Single-Execution-Equivalence. Subsequently, we present replication schemes for both imperative and declarative workflow languages. Imperative workflow languages, such as the Web Service Business Process Execution Language (WS-BPEL), specify the execution order of activities through an ordering relation and are the predominant way of specifying workflow models. We implement a proof of concept to demonstrate the compatibility of our replication schemes with current (imperative) workflow technology. Declarative workflow languages provide greater flexibility by allowing the activities within a workflow to be reordered at run-time. We exploit this by executing differently ordered replicas on several nodes in the network to improve availability further.

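    Contributions au rendement des protocoles de diffusion à ordre total et aux réseaux tolérants aux délais à base de RFID (Contributions to the throughput efficiency of total order broadcast protocols and to RFID-based delay-tolerant networks)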

    In asynchronous distributed systems, logical clocks and vector clocks are two fundamental tools for managing communication and data sharing between the entities of these systems. The goal of this PhD thesis is to exploit these tools with a view to implementation. In the first part, we focus on data communication and contribute to the field of uniform total order broadcast. We propose the trains protocol: tokens (called trains) circulate in parallel between the participating processes, which are arranged on a virtual ring. Each train carries a logical clock that is used to recover lost trains when processes fail. We prove that the trains protocol is a uniform total order broadcast protocol. We then introduce a new metric, throughput efficiency. With this metric we show that the trains protocol has a higher throughput efficiency than the protocol with the best throughput in the literature. The metric also gives a theoretical upper bound on the throughput that an implementation of a given broadcast protocol can reach, which makes it possible to evaluate the quality of a protocol implementation. Its throughput, in particular for small messages, makes the trains protocol a remarkable candidate for data sharing between the cores of a processor, and its low network overhead makes it a strong candidate for data replication between servers in the cloud. Part of this work was implemented in a control-command and supervision system deployed at several dozen industrial sites. In the second part, we focus on data sharing and contribute to the RFID field. We propose a distributed shared memory based on RFID tags. This memory removes the need for a global computer network: it uses vector clocks and relies on the network formed by the mobile users of the distributed application, so that users can read the contents of remote RFID tags. Our RFID-based distributed shared memory is an alternative to the three RFID-based architectures available in the literature. It was implemented in a pervasive game that was tried out by about a thousand people.
    PARIS-CNAM (751032301) / Sudoc, France
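
    The abstract names vector clocks as the mechanism behind the RFID-based shared memory but gives no details. The following is a minimal sketch of textbook vector clock operations, not the thesis's design: the function names `tick`, `merge`, and `compare`, and the user names in the example, are assumptions made only for illustration.

```python
def tick(clock: dict, node: str) -> dict:
    """Return a copy of the clock with this node's own entry incremented."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: dict, b: dict) -> dict:
    """Entry-wise maximum, applied when one node learns another node's clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: dict, b: dict) -> str:
    """Classify two clocks; missing entries count as zero."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"        # a causally precedes b
    if b_le_a:
        return "after"         # b causally precedes a
    return "concurrent"        # neither update has seen the other


# Two hypothetical mobile users write independently, then one sees the other's update.
alice = tick({}, "alice")
bob = tick({}, "bob")
bob_after_sync = tick(merge(alice, bob), "bob")
```

    Here `compare(alice, bob)` reports the two writes as concurrent, while `compare(alice, bob_after_sync)` reports that the first write precedes the synchronized one. This is the kind of ordering information a replica can use to decide which copy of a value is newer.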