4,787 research outputs found

    Optimal Termination Protocols for Network Partitioning

    Get PDF
    We address the problem of maintaining the distributed database consistency in presence of failures while maximizing the database availability. Network partitioning is a failure which partitions the distributed system into a number of parts, no part being able to communicate with any other. Formalizations of various notions in this context are developed and two measures for the performances of protocols in presence of a network partitioning are introduced. A general optimality theory is developed for two classes of protocols - centralized and decentralized. Optimal protocols are produced in all cases.published_or_final_versio

    Modelling, analysing and model checking commit protocols

    Get PDF

    ShallowForest: Optimizing All-to-All Data Transmission in WANs

    Get PDF
    All-to-all data transmission is a typical data transmission pattern in both consensus protocols and blockchain systems. Developing an optimization scheme that provides high throughput and low latency data transmission can significantly benefit the performance of those systems. This thesis investigates the problem of optimizing all-to-all data transmission in a wide area network (WAN) using overlay multicast. I first prove that in a congestion-free core network model, using shallow tree overlays with height up to two is sufficient for all-to-all data transmission to achieve the optimal throughput allowed by the available network resources. Based on this finding, I build ShallowForest, a data plane optimization for consensus protocols and blockchain systems. The goal of ShallowForest is to improve consensus protocols' resilience to skewed client load distribution. Experiments with skewed client load across replicas in the Amazon cloud demonstrate that ShallowForest can improve the commit throughput of the EPaxos consensus protocol by up to 100% with up to 60% reduction in commit latenc

    A proposal for a coordinated effort for the determination of brainwide neuroanatomical connectivity in model organisms at a mesoscopic scale

    Get PDF
    In this era of complete genomes, our knowledge of neuroanatomical circuitry remains surprisingly sparse. Such knowledge is however critical both for basic and clinical research into brain function. Here we advocate for a concerted effort to fill this gap, through systematic, experimental mapping of neural circuits at a mesoscopic scale of resolution suitable for comprehensive, brain-wide coverage, using injections of tracers or viral vectors. We detail the scientific and medical rationale and briefly review existing knowledge and experimental techniques. We define a set of desiderata, including brain-wide coverage; validated and extensible experimental techniques suitable for standardization and automation; centralized, open access data repository; compatibility with existing resources, and tractability with current informatics technology. We discuss a hypothetical but tractable plan for mouse, additional efforts for the macaque, and technique development for human. We estimate that the mouse connectivity project could be completed within five years with a comparatively modest budget.Comment: 41 page

    Optimizing recovery protocols for replicated database systems

    Full text link
    En la actualidad, el uso de tecnologías de informacíon y sistemas de cómputo tienen una gran influencia en la vida diaria. Dentro de los sistemas informáticos actualmente en uso, son de gran relevancia los sistemas distribuidos por la capacidad que pueden tener para escalar, proporcionar soporte para la tolerancia a fallos y mejorar el desempeño de aplicaciones y proporcionar alta disponibilidad. Los sistemas replicados son un caso especial de los sistemas distribuidos. Esta tesis está centrada en el área de las bases de datos replicadas debido al uso extendido que en el presente se hace de ellas, requiriendo características como: bajos tiempos de respuesta, alto rendimiento en los procesos, balanceo de carga entre las replicas, consistencia e integridad de datos y tolerancia a fallos. En este contexto, el desarrollo de aplicaciones utilizando bases de datos replicadas presenta dificultades que pueden verse atenuadas mediante el uso de servicios de soporte a mas bajo nivel tales como servicios de comunicacion y pertenencia. El uso de los servicios proporcionados por los sistemas de comunicación de grupos permiten ocultar los detalles de las comunicaciones y facilitan el diseño de protocolos de replicación y recuperación. En esta tesis, se presenta un estudio de las alternativas y estrategias empleadas en los protocolos de replicación y recuperación en las bases de datos replicadas. También se revisan diferentes conceptos sobre los sistemas de comunicación de grupos y sincronia virtual. Se caracterizan y clasifican diferentes tipos de protocolos de replicación con respecto a la interacción o soporte que pudieran dar a la recuperación, sin embargo el enfoque se dirige a los protocolos basados en sistemas de comunicación de grupos. Debido a que los sistemas comerciales actuales permiten a los programadores y administradores de sistemas de bases de datos renunciar en alguna medida a la consistencia con la finalidad de aumentar el rendimiento, es importante determinar el nivel de consistencia necesario. En el caso de las bases de datos replicadas la consistencia está muy relacionada con el nivel de aislamiento establecido entre las transacciones. Una de las propuestas centrales de esta tesis es un protocolo de recuperación para un protocolo de replicación basado en certificación. Los protocolos de replicación de base de datos basados en certificación proveen buenas bases para el desarrollo de sus respectivos protocolos de recuperación cuando se utiliza el nivel de aislamiento snapshot. Para tal nivel de aislamiento no se requiere que los readsets sean transferidos entre las réplicas ni revisados en la fase de cetificación y ya que estos protocolos mantienen un histórico de la lista de writesets que es utilizada para certificar las transacciones, este histórico provee la información necesaria para transferir el estado perdido por la réplica en recuperación. Se hace un estudio del rendimiento del protocolo de recuperación básico y de la versión optimizada en la que se compacta la información a transferir. Se presentan los resultados obtenidos en las pruebas de la implementación del protocolo de recuperación en el middleware de soporte. La segunda propuesta esta basada en aplicar el principio de compactación de la informacion de recuperación en un protocolo de recuperación para los protocolos de replicación basados en votación débil. El objetivo es minimizar el tiempo necesario para transfeir y aplicar la información perdida por la réplica en recuperación obteniendo con esto un protocolo de recuperación mas eficiente. Se ha verificado el buen desempeño de este algoritmo a través de una simulación. Para efectuar la simulación se ha hecho uso del entorno de simulación Omnet++. En los resultados de los experimentos puede apreciarse que este protocolo de recuperación tiene buenos resultados en múltiples escenarios. Finalmente, se presenta la verificación de la corrección de ambos algoritmos de recuperación en el Capítulo 5.Nowadays, information technology and computing systems have a great relevance on our lives. Among current computer systems, distributed systems are one of the most important because of their scalability, fault tolerance, performance improvements and high availability. Replicated systems are a specific case of distributed system. This Ph.D. thesis is centered in the replicated database field due to their extended usage, requiring among other properties: low response times, high throughput, load balancing among replicas, data consistency, data integrity and fault tolerance. In this scope, the development of applications that use replicated databases raises some problems that can be reduced using other fault-tolerant building blocks, as group communication and membership services. Thus, the usage of the services provided by group communication systems (GCS) hides several communication details, simplifying the design of replication and recovery protocols. This Ph.D. thesis surveys the alternatives and strategies being used in the replication and recovery protocols for database replication systems. It also summarizes different concepts about group communication systems and virtual synchrony. As a result, the thesis provides a classification of database replication protocols according to their support to (and interaction with) recovery protocols, always assuming that both kinds of protocol rely on a GCS. Since current commercial DBMSs allow that programmers and database administrators sacrifice consistency with the aim of improving performance, it is important to select the appropriate level of consistency. Regarding (replicated) databases, consistency is strongly related to the isolation levels being assigned to transactions. One of the main proposals of this thesis is a recovery protocol for a replication protocol based on certification. Certification-based database replication protocols provide a good basis for the development of their recovery strategies when a snapshot isolation level is assumed. In that level readsets are not needed in the validation step. As a result, they do not need to be transmitted to other replicas. Additionally, these protocols hold a writeset list that is used in the certification/validation step. That list maintains the set of writesets needed by the recovery protocol. This thesis evaluates the performance of a recovery protocol based on the writeset list tranfer (basic protocol) and of an optimized version that compacts the information to be transferred. The second proposal applies the compaction principle to a recovery protocol designed for weak-voting replication protocols. Its aim is to minimize the time needed for transferring and applying the writesets lost by the recovering replica, obtaining in this way an efficient recovery. The performance of this recovery algorithm has been checked implementing a simulator. To this end, the Omnet++ simulating framework has been used. The simulation results confirm that this recovery protocol provides good results in multiple scenarios. Finally, the correction of both recovery protocols is also justified and presented in Chapter 5.García Muñoz, LH. (2013). Optimizing recovery protocols for replicated database systems [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/31632TESI

    Cost- and workload-driven data management in the cloud

    Get PDF
    This thesis deals with the challenge of finding the right balance between consistency, availability, latency and costs, captured by the CAP/PACELC trade-offs, in the context of distributed data management in the Cloud. At the core of this work, cost and workload-driven data management protocols, called CCQ protocols, are developed. First, this includes the development of C3, which is an adaptive consistency protocol that is able to adjust consistency at runtime by considering consistency and inconsistency costs. Second, the development of Cumulus, an adaptive data partitioning protocol, that can adapt partitions by considering the application workload so that expensive distributed transactions are minimized or avoided. And third, the development of QuAD, a quorum-based replication protocol, that constructs the quorums in such a way so that, given a set of constraints, the best possible performance is achieved. The behavior of each CCQ protocol is steered by a cost model, which aims at reducing the costs and overhead for providing the desired data management guarantees. The CCQ protocols are able to continuously assess their behavior, and if necessary to adapt the behavior at runtime based on application workload and the cost model. This property is crucial for applications deployed in the Cloud, as they are characterized by a highly dynamic workload, and high scalability and availability demands. The dynamic adaptation of the behavior at runtime does not come for free, and may generate considerable overhead that might outweigh the gain of adaptation. The CCQ cost models incorporate a control mechanism, which aims at avoiding expensive and unnecessary adaptations, which do not provide any benefits to applications. The adaptation is a distributed activity that requires coordination between the sites in a distributed database system. The CCQ protocols implement safe online adaptation approaches, which exploit the properties of 2PC and 2PL to ensure that all sites behave in accordance with the cost model, even in the presence of arbitrary failures. It is crucial to guarantee a globally consistent view of the behavior, as in contrary the effects of the cost models are nullified. The presented protocols are implemented as part of a prototypical database system. Their modular architecture allows for a seamless extension of the optimization capabilities at any level of their implementation. Finally, the protocols are quantitatively evaluated in a series of experiments executed in a real Cloud environment. The results show their feasibility and ability to reduce application costs, and to dynamically adjust the behavior at runtime without violating their correctness
    corecore