14 research outputs found

    Enhanced Failure Detection Mechanism in MapReduce

    The popularity of the MapReduce programming model has increased interest in the research community in improving it. Among other directions, fault tolerance, and specifically failure detection, appears crucial but has not yet reached a satisfactory level. Motivated by this, I devoted my main research during this period to building a prototype system architecture of the MapReduce framework with a new failure detection service, comprising both an analytical (theoretical) part and an implementation part. I am confident that this work can lead the way for further contributions to failure detection in NoSQL application frameworks and in cloud storage systems in general.

    The eventual leadership in dynamic mobile networking environments

    2007-2008 > Academic research: refereed > Refereed conference paper. Version of Record. Publishe

    The notion of Timed Registers and its application to Indulgent Synchronization

    A new type of shared object, called timed register, is proposed and used to design indulgent timing-based algorithms. A timed register generalizes the notion of an atomic register as follows: if a process invokes two consecutive operations on the same timed register which are a read followed by a write, then the write operation is executed only if it is invoked at most $d$ time units after the read operation, where $d$ is defined as part of the read operation. In this context, a timing-based algorithm is an algorithm whose correctness relies on the existence of a bound $\Delta$ such that any pair of consecutive constrained read and write operations issued by the same process on the same timed register are separated by at most $\Delta$ time units. An indulgent algorithm is an algorithm that always guarantees the safety properties, and ensures the liveness property as soon as the timing assumptions are satisfied. The usefulness of this new type of shared object is demonstrated by presenting simple and elegant indulgent timing-based algorithms that solve the mutual exclusion, $\ell$-exclusion, adaptive renaming, test&set, and consensus problems. Interestingly, timed registers are universal objects in systems with process crashes and transient timing failures (i.e., they allow building any concurrent object with a sequential specification). The paper also suggests connections with schedulers and contention managers.
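
The read-then-write rule described in this abstract can be illustrated with a small sketch. This is not the paper's construction: it is a single-threaded toy with no real atomicity, and the class name `TimedRegister`, the per-process `pid` argument, the injectable `clock`, and the boolean return value of `write` are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional


class TimedRegister:
    """Toy model of a timed register (hypothetical API, not from the paper)."""

    def __init__(self, initial: Any = None,
                 clock: Callable[[], float] = time.monotonic):
        self._value = initial
        self._clock = clock
        # Deadline set by each process's most recent constrained read.
        self._deadline: Dict[str, float] = {}

    def read(self, pid: str, d: Optional[float] = None) -> Any:
        """Read the value; if d is given, the next write by pid is constrained."""
        if d is None:
            self._deadline.pop(pid, None)
        else:
            self._deadline[pid] = self._clock() + d
        return self._value

    def write(self, pid: str, value: Any) -> bool:
        """Execute the write only if it is invoked within d of the constrained read."""
        deadline = self._deadline.pop(pid, None)
        if deadline is not None and self._clock() > deadline:
            return False  # invoked too late: the write is not executed
        self._value = value
        return True


# Usage with a fake clock so the timing is deterministic.
now = [0.0]
reg = TimedRegister(initial=0, clock=lambda: now[0])

reg.read("p1", d=5.0)
now[0] = 3.0
ok_in_time = reg.write("p1", 1)   # 3 time units after the read

reg.read("p1", d=5.0)
now[0] = 10.0
ok_late = reg.write("p1", 2)      # 7 time units after the read

final_value = reg.read("p1")
```

Injecting the clock is only a testing convenience; the abstract leaves open how time is measured and how atomicity is obtained.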

    A Timing Assumption and two $t$-Resilient Protocols for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems

    This paper considers the problem of electing an eventual leader in an asynchronous shared memory system. While this problem has received a lot of attention in message-passing systems, very few solutions have been proposed for shared memory systems. As an eventual leader cannot be elected in a pure asynchronous system prone to process crashes, the paper first proposes to enrich the asynchronous system model with an additional assumption. That assumption (denoted $\mathit{AWB}$) is particularly weak. It is made up of two complementary parts. More precisely, it requires that, after some time, (1) there is a process whose write accesses to some shared variables be timely, and (2) the timers of $(t-f)$ other processes be asymptotically well-behaved ($t$ denotes the maximal number of processes that may crash, and $f$ the actual number of process crashes in a run). The asymptotically well-behaved timer notion is a new notion that generalizes and weakens the traditional notion of timers whose durations are required to monotonically increase when the values they are set to increase (a timer works incorrectly when it expires at arbitrary times, i.e., independently of the value it has been set to). The paper then focuses on the design of $t$-resilient $\mathit{AWB}$-based eventual leader protocols. "$t$-resilient" means that each protocol can cope with up to $t$ process crashes (taking $t=n-1$ provides wait-free protocols, i.e., protocols that can cope with any number of process failures). Two protocols are presented. The first enjoys the following noteworthy properties: after some time only the elected leader has to write the shared memory, and all but one shared variables have a bounded domain, be the execution finite or infinite. This protocol is consequently optimal with respect to the number of processes that have to write the shared memory. The second protocol guarantees that all the shared variables have a bounded domain. This is obtained at the following additional price: all the processes are required to forever write the shared memory. A theorem is proved which states that this price has to be paid by any protocol that elects an eventual leader in a bounded shared memory model. This second protocol is consequently optimal with respect to the number of processes that have to write in such a constrained memory model. In a very interesting way, these protocols show an inherent tradeoff relating the number of processes that have to write the shared memory and the bounded/unbounded attribute of that memory.

    Failure Detectors in Distributed Systems (Kaatumisilmaisimet hajautetuissa järjestelmissä)

    Cloud services have for some time been a central, commercially exploited form of service on the Internet. Familiar examples include the Internet's numerous search engines and the various storage services for electronic material that are tied to an email account, such as Google Drive. The architecture underlying cloud services is that of distributed systems. Indeed, one can say that cloud services were born when distributed systems began to be exploited commercially. This thesis examines how node crashes in distributed systems are monitored, that is, failure detectors in distributed systems. It also examines two key topics related to failure detectors: the structure of a distributed system and communication within a distributed system.

    The Failure Detector Abstraction

    A failure detector is a fundamental abstraction in distributed computing. This paper surveys this abstraction through two dimensions. First we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. In particular, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of the dimensions
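
As a rough illustration of how a failure detector can hide timing assumptions behind a simple "suspected set" interface, the sketch below uses heartbeats with a per-peer timeout that grows after each false suspicion. This is a common textbook-style construction, not taken from the surveyed paper; the class name, method names, and the doubling policy are assumptions chosen for the example.

```python
from typing import Dict, Iterable, Set


class HeartbeatFailureDetector:
    """Toy heartbeat-based detector with adaptive timeouts (hypothetical API)."""

    def __init__(self, peers: Iterable[str],
                 initial_timeout: float = 1.0, now: float = 0.0):
        peers = list(peers)
        self.timeout: Dict[str, float] = {p: initial_timeout for p in peers}
        self.last_heard: Dict[str, float] = {p: now for p in peers}
        self.suspected: Set[str] = set()

    def on_heartbeat(self, peer: str, now: float) -> None:
        """Record a heartbeat; a suspected peer that speaks up was wrongly suspected."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            self.suspected.discard(peer)
            self.timeout[peer] *= 2  # be more patient with this peer next time

    def check(self, now: float) -> Set[str]:
        """Suspect every peer whose last heartbeat is older than its timeout."""
        for peer, heard in self.last_heard.items():
            if now - heard > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)


# Usage: peer "b" is late once, gets suspected, then is trusted again.
fd = HeartbeatFailureDetector(["a", "b"], initial_timeout=1.0, now=0.0)
fd.on_heartbeat("a", now=0.5)
first = fd.check(now=1.2)       # only "b" has been silent for more than 1.0
fd.on_heartbeat("b", now=1.3)   # false suspicion: timeout for "b" doubles
second = fd.check(now=1.4)
```

An algorithm built on top of `check` never reads a clock itself, which is the sense in which the timing assumptions are factored out into the detector.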

    Solving key design issues for massively multiplayer online games on peer-to-peer architectures

    Massively Multiplayer Online Games (MMOGs) are increasing in both popularity and scale on the Internet and are predominantly implemented by Client/Server architectures. While such a classical approach to distributed system design offers many benefits, it suffers from significant technical and commercial drawbacks, primarily reliability and scalability costs. This realisation has sparked recent research interest in adapting MMOGs to Peer-to-Peer (P2P) architectures. This thesis identifies six key design issues to be addressed by P2P MMOGs, namely interest management, event dissemination, task sharing, state persistency, cheating mitigation, and incentive mechanisms. Design alternatives for each issue are systematically compared, and their interrelationships discussed. How well representative P2P MMOG architectures fulfil the design criteria is also evaluated. It is argued that although P2P MMOG architectures are developing rapidly, their support for task sharing and incentive mechanisms still needs to be improved. The design of a novel framework for P2P MMOGs, Mediator, is presented. It employs a self-organising super-peer network over a P2P overlay infrastructure, and addresses the six design issues in an integrated system. The Mediator framework is extensible, as it supports flexible policy plug-ins and can accommodate the introduction of new super-peer roles. Key components of this framework have been implemented and evaluated with a simulated P2P MMOG. As the Mediator framework relies on super-peers for computational and administrative tasks, membership management is crucial, e.g. to allow the system to recover from super-peer failures. A new technology for this, namely Membership-Aware Multicast with Bushiness Optimisation (MAMBO), has been designed, implemented and evaluated. It reuses the communication structure of a tree-based application-level multicast to track group membership efficiently. Evaluation of a demonstration application shows that MAMBO is able to quickly detect and handle peers joining and leaving. Compared to a conventional supervision architecture, MAMBO is more scalable, and yet incurs lower communication overhead. Besides MMOGs, MAMBO is suitable for other P2P applications, such as collaborative computing and multimedia streaming. This thesis also presents the design, implementation and evaluation of a novel task mapping infrastructure for heterogeneous P2P environments, Deadline-Driven Auctions (DDA). DDA is primarily designed to support NPC host allocation in P2P MMOGs, and specifically in the Mediator framework. However, it can also support the sharing of computational and interactive tasks with various deadlines in general P2P applications. Experimental and analytical results demonstrate that DDA efficiently allocates computing resources for large numbers of real-time NPC tasks in a simulated P2P MMOG with approximately 1000 players. Furthermore, DDA supports gaming interactivity by keeping the communication latency among NPC hosts and ordinary players low. It also supports flexible matchmaking policies, and can motivate application participants to contribute resources to the system.

    ITVM: an approach to intrusion tolerance using virtual machines (ITVM: uma abordagem para tolerância a intrusão utilizando máquinas virtuais)

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Engenharia de Automação e Sistemas. The Internet has become the means of communication that many companies use to publicise their services. The security of such services is therefore a matter of great importance and necessity. Because these systems are now distributed over the Internet, they are exposed to an extremely hostile environment, where intrusions by malicious entities occur with great frequency. These intrusions cause losses and damage, often on a catastrophic scale, both for companies and for society. This dissertation presents an infrastructure whose purpose is to provide intrusion tolerance (tolerance of malicious or Byzantine faults) for services. The approach uses virtualization technology and shared memory in order to reduce the cost of executing state machine (ME) protocols in the Byzantine context. The use of virtualization allowed a trusted component to be inserted into the model. As this technology separates into different layers the processes that manage the virtual machines and the processes that execute them, it is possible to insert a component that manages the replicas, starting them or shutting them down when necessary. In addition, this trusted component is responsible for executing the protocol's agreement service, and it allowed the creation of an architecture with reduced costs, in which client requests can be executed with a variable number of replicas (between f+1 and 2f+1). The work discusses the algorithms and presents details of a prototype, tests, and a comparison with related work in the literature, showing a reduction both in the number of servers required and in the number of replicas needed to execute the replicated service.