    Building global and scalable systems with atomic multicast

    The rise of worldwide Internet-scale services demands large distributed systems. Indeed, when handling several million users, it is common to operate thousands of servers spread across the globe. Here, replication plays a central role, as it helps improve the user experience by hiding failures and by providing acceptable latency. In this thesis, we claim that atomic multicast, with strong and well-defined properties, is the appropriate abstraction for efficiently designing and implementing globally scalable distributed systems. Internet-scale services rely on data partitioning and replication to provide scalable performance and high availability. Moreover, to reduce user-perceived response times and to tolerate disasters (i.e., the failure of a whole datacenter), services are increasingly becoming geographically distributed. Data partitioning and replication, combined with local and geographical distribution, introduce daunting challenges, including the need to carefully order requests among replicas and partitions. One way to tackle this problem is to use group communication primitives that encapsulate ordering requirements. While replication is a common technique for designing such reliable distributed systems, to cope with the requirements of modern cloud-based "always-on" applications, replication protocols must additionally allow for throughput scalability and dynamic reconfiguration, that is, on-demand replacement or provisioning of system resources. We propose a dynamic atomic multicast protocol that fulfills these requirements. It allows resources to be dynamically added to and removed from an online replicated state machine, and crashed processes to be recovered. Major efforts have been devoted in recent years to improving the performance, scalability, and reliability of distributed systems. To hide the complexity of designing distributed applications, many proposals provide efficient high-level communication abstractions. Since implementing a production-ready system based on such an abstraction is still a major task, we further propose to expose our protocol to developers in the form of distributed data structures. B-trees, for example, are commonly used in many kinds of applications, including database indexes and file systems. Providing a distributed, fault-tolerant, and scalable data structure would help developers integrate their applications in a distribution-transparent manner. This work describes how to build reliable and scalable distributed systems based on atomic multicast and demonstrates their capabilities with an implementation of a distributed ordered map that supports dynamic re-partitioning and fast recovery. To substantiate our claim, we ported an existing SQL database atop our distributed lock-free data structure.
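
    To make the abstraction concrete, here is a minimal single-process Python sketch of the idea: an atomic multicast primitive assigns one global order to every command, and each replica of each addressed partition applies commands in that order. This is a toy simulation, not the thesis's protocol; all names (AtomicMulticast, OrderedMapReplica, amcast) are illustrative.

    import bisect
    from itertools import count

    class OrderedMapReplica:
        """One replica of one partition; keys are kept sorted for range scans."""
        def __init__(self):
            self.keys, self.vals = [], {}

        def apply(self, op, key, value=None):
            if op == "put":
                if key not in self.vals:
                    bisect.insort(self.keys, key)
                self.vals[key] = value
            elif op == "get":
                return self.vals.get(key)

    class AtomicMulticast:
        """Toy sequencer standing in for a real atomic multicast protocol:
        commands addressed to several partitions receive one global order."""
        def __init__(self, partitions):
            self.partitions = partitions      # partition name -> list of replicas
            self.seq = count()                # global delivery order

        def amcast(self, dests, op, key, value=None):
            order = next(self.seq)            # single order across all destinations
            for name in sorted(dests):        # deterministic delivery per partition
                for replica in self.partitions[name]:
                    replica.apply(op, key, value)
            return order

    partitions = {"p0": [OrderedMapReplica() for _ in range(3)],
                  "p1": [OrderedMapReplica() for _ in range(3)]}
    amc = AtomicMulticast(partitions)
    amc.amcast({"p0"}, "put", "alice", 1)
    amc.amcast({"p0", "p1"}, "put", "bob", 2)  # cross-partition command, same order everywhere

    In a real deployment the order would be established by the replicas themselves rather than by a central sequencer; the sketch only fixes the interface that the ordered map builds upon.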

    Continuum: an architecture for user evolvable collaborative virtual environments

    Continuum is a software platform for collaborative virtual environments. Continuum's architecture supplies a world model and defines how to combine object state, behavior code, and resource data into this single shared structure. The system frees distributed users from the constraints of monolithic, centralized virtual world architectures and instead allows individual users to extend and evolve the virtual world by creating and controlling their own individual pieces of the larger world model. The architecture provides support for data distribution, code management, resource management, and rapid deployment through standardized viewers. This work not only provides the architecture but also includes a proven implementation and the associated development tools for creating these worlds.
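
    The following hypothetical Python sketch illustrates the kind of world model described above, where each object bundles state, behavior code, and resource references, and individual users extend the shared world by registering their own pieces. The class and field names are invented for illustration and are not Continuum's actual API.

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class WorldObject:
        owner: str                               # the user who controls this piece
        state: Dict[str, object] = field(default_factory=dict)
        resources: List[str] = field(default_factory=list)  # e.g. mesh/texture URIs
        behavior: Callable[["WorldObject"], None] = lambda obj: None

        def tick(self):
            self.behavior(self)                  # run the user-supplied behavior code

    class World:
        """The single shared structure: a registry that any user may extend."""
        def __init__(self):
            self.objects: Dict[str, WorldObject] = {}

        def add(self, name: str, obj: WorldObject):
            self.objects[name] = obj             # no central gatekeeper required

        def step(self):
            for obj in self.objects.values():
                obj.tick()

    world = World()
    world.add("lamp", WorldObject(owner="alice",
                                  state={"on": False},
                                  behavior=lambda o: o.state.update(on=True)))
    world.step()                                 # alice's lamp runs its own behavior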

    Interprocess communication in highly distributed systems

    Issued as final technical report, Project no. G-36-632. The final technical report has the title: Interprocess communication in highly distributed systems.

    Transaction Chains: Achieving Serializability with Low Latency in Geo-distributed Storage Systems

    Currently, users of geo-distributed storage systems face a hard choice between serializable transactions with high latency, or limited or no transactions with low latency. We show that it is possible to obtain both serializable transactions and low latency, under two conditions. First, transactions are known ahead of time, permitting an a priori static analysis of conflicts. Second, transactions are structured as transaction chains consisting of a sequence of hops, each hop modifying data at one server. To demonstrate this idea, we built Lynx, a geo-distributed storage system that offers transaction chains, secondary indexes, materialized join views, and geo-replication. Lynx uses static analysis to determine whether each hop can execute separately while preserving serializability; if so, a client need wait only for the first hop to complete, which occurs quickly. To evaluate Lynx, we built three applications: an auction service, a Twitter-like microblogging site, and a social networking site. These applications successfully use chains to achieve low-latency operation and good throughput.
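
    A toy Python sketch of the execution pattern the abstract describes: a chain is a sequence of hops, each touching one server; if a (stubbed) static analysis deems the chain safe, the client returns after the first hop and the remaining hops run asynchronously, in order. All names (Hop, run_chain, chain_is_safe) are illustrative and are not Lynx's interface.

    from collections import deque
    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Hop:
        server: str                       # each hop touches exactly one server
        op: Callable[[dict], None]        # mutation applied to that server's state

    class Server:
        def __init__(self):
            self.state: dict = {}
            self.queue: deque = deque()   # deferred hops, applied in chain order

    def chain_is_safe(chain: List[Hop]) -> bool:
        # Stand-in for the a priori static analysis of conflicts.
        return True

    def run_chain(servers: Dict[str, Server], chain: List[Hop]):
        first, rest = chain[0], chain[1:]
        first.op(servers[first.server].state)     # the client waits only for this hop
        if chain_is_safe(chain):
            for hop in rest:                      # later hops execute asynchronously
                servers[hop.server].queue.append(hop)

    servers = {"s1": Server(), "s2": Server()}
    run_chain(servers, [Hop("s1", lambda st: st.update(posts=1)),
                        Hop("s2", lambda st: st.update(index="post-1"))])
    for s in servers.values():                    # background workers would drain these
        while s.queue:
            s.queue.popleft().op(s.state)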

    MegSDF: a Mega-System development framework

    A framework for developing large, complex software systems, called Mega-Systems, is specified. The framework incorporates engineering, managerial, and technological aspects of development, concentrating on an engineering process. MegSDF proposes developing Mega-Systems as open distributed systems, pre-planned to be integrated with other systems, and designed for change. At the management level, MegSDF divides the development of a Mega-System into multiple coordinated projects, distinguishing between a meta-management for the whole development effort, responsible for long-term, global objectives, and local managements for the smaller projects, responsible for local, temporary objectives. At the engineering level, MegSDF defines a process model which specifies the tasks required for developing Mega-Systems, including their deliverables and interrelationships. The engineering process emphasizes the coordination required to develop the constituent systems. The process remains active for the lifetime of the Mega-System and is compatible with different approaches to performing its tasks. The engineering process consists of System, Mega-System, Mega-System Synthesis, and Meta-Management tasks. System tasks develop the constituent systems. Mega-System tasks provide a means for engineering coordination, comprising the Domain Analysis, Mega-System Architecture Design, and Infrastructure Acquisition tasks. Mega-System Synthesis tasks assemble Mega-Systems from the constituent systems. The Meta-Management task plans and controls the entire process. The Domain Analysis task provides a general, comprehensive, non-constructive domain model, which is used as a common basis for understanding the domain. MegSDF builds the domain model by integrating multiple significant perceptions of the domain and recommends using a domain modeling schema to facilitate modeling and integrating those perceptions. The Mega-System Architecture Design task specifies a conceptual architecture and an application architecture. The conceptual architecture specifies common design and implementation concepts and is defined using multiple views. The application architecture maps the domain model onto an implementation and defines the overall structure of the Mega-System, its boundaries, components, and interfaces. The Infrastructure Acquisition task addresses the technological aspects of development. It is responsible for choosing, developing or purchasing, validating, and supporting an infrastructure. The infrastructure integrates the enabling technologies into a unified platform which is used as a common solution for handling technologies. The infrastructure facilitates portability of systems and the incorporation of new technologies. It is implemented as a set of services, divided into separate service groups which correspond to the views identified in the conceptual architecture.

    Reassembling Knowledge Translation Through a Case of Autism Genomics: Multiplicity and Coordination Amidst Practiced Actor-Networks

    Knowledge translation (KT) has become a ubiquitous and important component within the Canadian health research funding environment. Despite a large and burgeoning literature on the topic of KT, research on the science of KT spans a very narrow philosophical spectrum, with published studies almost exclusively positioned within positivism. Grounded in a constructionist philosophical position and influenced by actor-network theory, this dissertation aims to contribute to the Canadian KT discussion by imagining new possibilities for conceptualizing KT. This is an empirical-theoretical study based on eight months of data collection, including interviews, participant observation, and document analysis. This data collection took place in a basic science laboratory, a clinic, and amongst families involved in genomic research pertaining to Autism Spectrum Disorder in a Canadian city. Interviews were transcribed verbatim, and organization of the data was aided by QSR NVivo software. Theoretical insights put forward in this dissertation are based on a detailed description of the everyday, local micro-dynamics of knowledge translation within a particular case study of an autism genomics project. Through data collection I have followed the practices of a laboratory, a clinic, and family homes through which genomic knowledge was assembled and re-assembled. Through the exploration of the practices of scientists, clinicians, and families involved in an autism genetics study, I examine the concepts of multiplicity, difference, and coordination. I argue that autism is practiced differently, through different technologies and assessments, in the laboratory, clinic, and home. This dissertation closes with a new framework for and model of the knowledge translation process called the Local Translations of Knowledge in Practice model. I argue that expanding the range of theoretical and philosophical positions attended to in KT research will contribute to a richer understanding of the KT process and move forward the Canadian KT agenda. Ethics approval for this research was obtained from The University of Western Ontario and from the hospital in which the data was gathered.

    High Performance Computing for DNA Sequence Alignment and Assembly

    Recent advances in DNA sequencing technology have dramatically increased the scale and scope of DNA sequencing. These data are used for a wide variety of important biological analyses, including genome sequencing, comparative genomics, transcriptome analysis, and personalized medicine, but are complicated by the volume and complexity of the data involved. Given the massive size of these datasets, computational biology must draw on the advances of high performance computing. Two fundamental computations in computational biology are read alignment and genome assembly. Read alignment maps short DNA sequences to a reference genome to discover conserved and polymorphic regions of the genome. Genome assembly computes the sequence of a genome from many short DNA sequences. Both computations benefit from recent advances in high performance computing to efficiently process the huge datasets involved, including using highly parallel graphics processing units (GPUs) as high performance desktop processors, and using the MapReduce framework coupled with cloud computing to parallelize computation across large compute grids. This dissertation demonstrates how these technologies can be used to accelerate these computations by orders of magnitude, and how they have the potential to make otherwise infeasible computations practical.
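
    As an illustration of the MapReduce pattern mentioned above, the short Python sketch below counts k-mers across a set of reads, the kind of kernel that underlies de Bruijn graph assembly. In a real deployment the map and reduce phases would run across many machines of a compute grid; this local, single-process version only shows the shape of the computation and is not the dissertation's code.

    from collections import defaultdict

    def map_phase(reads, k):
        """Map: emit a (k-mer, 1) pair for every k-length window of every read."""
        for read in reads:
            for i in range(len(read) - k + 1):
                yield read[i:i + k], 1

    def reduce_phase(pairs):
        """Reduce: sum counts per k-mer (each key could go to a different reducer)."""
        counts = defaultdict(int)
        for kmer, n in pairs:
            counts[kmer] += n
        return dict(counts)

    reads = ["ACGTACGT", "CGTACGTA"]
    print(reduce_phase(map_phase(reads, k=4)))   # {'ACGT': 3, 'CGTA': 3, ...}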

    Ontology of music performance variation

    Performance variation in rhythm determines the extent to which humans perceive and feel the effect of rhythmic pulsation and music in general. In many cases, these rhythmic variations can be linked to percussive performance. Such percussive performance variations are often absent from current percussive rhythmic models. The purpose of this thesis is to present an interactive computer model, called the PD-103, that simulates the micro-variations in human percussive performance. This thesis makes three main contributions to existing knowledge: firstly, it formalises a new method for modelling percussive performance; secondly, it develops a new compositional software tool called the PD-103 that models human percussive performance; and finally, it creates a portfolio of different musical styles to demonstrate the capabilities of the software. A large database of recorded samples is classified into zones based upon the vibrational characteristics of the instruments, to model timbral variation in human percussive performance. The degree of timbral variation is governed by principles of biomechanics and human percussive performance. A fuzzy logic algorithm is applied to analyse current and first-order sample selection in order to formulate an ontological description of music performance variation. Asynchrony values were extracted from recorded performances at three different skill levels to create "timing fingerprints" which characterise features unique to each percussionist. The PD-103 uses real performance timing data to determine asynchrony values for each synthesised note. The spectral content of the sample database forms a three-dimensional loudness/timbre space, intersecting instrumental behaviour with music composition. The reparameterisation of the sample database, following analysis of loudness, spectral flatness, and spectral centroid, provides an opportunity to creatively explore the timbral variations inherent in percussion instruments. The PD-103 was used to create a music portfolio exploring different rhythmic possibilities, with a focus on the meso-periodic rhythms common to parts of West Africa, jazz drumming, and electroacoustic music. The portfolio also includes new timbral percussive works based on spectral features, and it demonstrates the central aim of this thesis: the creation of a new compositional software tool that integrates human percussive performance and extends this model to different genres of music.
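
    A hypothetical Python sketch of the timing model described above: each synthesised note's nominal onset is shifted by an asynchrony value drawn from a performer's "timing fingerprint", and a loudness value selects one of the vibrational zones of the sample database. The fingerprint statistics and zone boundaries below are invented placeholders, not the PD-103's measured data.

    import random

    # Mean and standard deviation (seconds) of onset asynchrony per skill level.
    # These numbers are illustrative, not the extracted fingerprints.
    FINGERPRINTS = {"novice": (0.012, 0.010),
                    "intermediate": (0.006, 0.005),
                    "expert": (0.002, 0.002)}

    def humanise(onsets, performer):
        """Shift each nominal onset by an asynchrony value from the fingerprint."""
        mu, sigma = FINGERPRINTS[performer]
        return [t + random.gauss(mu, sigma) for t in onsets]

    def sample_zone(loudness):
        """Map a 0..1 loudness to one of four vibrational zones of the database."""
        return min(int(loudness * 4), 3)         # zone 0 = softest strike

    grid = [i * 0.5 for i in range(8)]           # quarter-note grid at 120 BPM
    print(humanise(grid, "expert"))              # slightly displaced onsets
    print(sample_zone(0.7))                      # -> zone 2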