2,074 research outputs found

    Client-access protocols for replicated services


    LogBase: A Scalable Log-structured Database System in the Cloud

    Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads in write-heavy environments and hence adversely affects the write throughput and recovery time of the system. In this paper, we introduce LogBase, a scalable log-structured database system that adopts log-only storage to remove the write bottleneck and support fast system recovery. LogBase is designed to be dynamically deployed on commodity clusters to take advantage of the elastic scaling property of cloud environments. LogBase provides in-memory multiversion indexes for efficient access to data maintained in the log. LogBase also supports transactions that bundle read and write operations spanning multiple records. We implemented the proposed system and compared it with HBase and a disk-based log-structured record-oriented system modeled after RAMCloud. The experimental results show that LogBase is able to provide sustained write throughput, efficient data access out of the cache, and effective system recovery. Comment: VLDB201
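The log-only idea in this abstract can be illustrated with a toy sketch. It is not LogBase's implementation: the class `LogOnlyStore`, its methods, and the use of a Python list as the "log" are all assumptions made for illustration. It only shows the general pattern of an append-only log as the sole data repository, an in-memory multiversion index pointing into it, and recovery by rescanning the log.

```python
import json
from bisect import bisect_right


class LogOnlyStore:
    """Toy log-only store (hypothetical): the append-only log holds all data."""

    def __init__(self):
        self.log = []     # stand-in for an append-only log file
        self.index = {}   # key -> list of (version, log offset), ascending
        self.clock = 0    # monotonically increasing version counter

    def put(self, key, value):
        # A write is a single append; there is no separate data file to update.
        self.clock += 1
        record = json.dumps({"key": key, "version": self.clock, "value": value})
        self.log.append(record)
        self.index.setdefault(key, []).append((self.clock, len(self.log) - 1))
        return self.clock

    def get(self, key, as_of=None):
        # Return the newest value whose version is <= as_of (default: latest).
        versions = self.index.get(key, [])
        if as_of is None:
            as_of = self.clock
        pos = bisect_right(versions, (as_of, float("inf")))
        if pos == 0:
            return None
        _, offset = versions[pos - 1]
        return json.loads(self.log[offset])["value"]

    def recover(self):
        # Rebuild the in-memory index by scanning the log from the start.
        self.index = {}
        self.clock = 0
        for offset, record in enumerate(self.log):
            entry = json.loads(record)
            self.index.setdefault(entry["key"], []).append(
                (entry["version"], offset)
            )
            self.clock = max(self.clock, entry["version"])
```

Because every version stays in the log, reading an older snapshot only needs a lookup in the index with an earlier `as_of` value.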

    Managing Population and Workload Imbalance in Structured Overlays

    Every day the amount of data produced by networked devices increases. The current paradigm is to offload the data produced to data centers to be processed. However, as more and more devices offload their data to data centers, accessing data becomes increasingly challenging. To combat this problem, systems are bringing data closer to the consumer and distributing network responsibilities among the end devices. We are witnessing a change in networking paradigm, where data storage and computation that were once handled only in the cloud are being handled by Internet of Things (IoT) and mobile devices, thanks to the ever-increasing technological capabilities of these devices. One approach organizes devices into a structured overlay network. Structured overlays are a common approach to address the organization and distribution of data in peer-to-peer distributed systems. Due to their nature, indexing and searching for elements of the system becomes trivial, so structured overlays are ideal building blocks for resource-location applications. Such overlays assume that the data is distributed evenly over the peers, and that the popularity of those data items is also evenly balanced. However, in many systems, due to factors outside the system domain, popularity may behave rather randomly, causing some nodes to spend more resources serving popular items than others. In this work we exploit the properties of cluster-based structured overlays to address this problem, extending a structured overlay with mechanisms that manage population and workload imbalance and achieve a more uniform use of resources. Our approach focuses on implementing a Group-Based Distributed Hash Table (DHT) capable of dynamically changing its groups to accommodate changes in churn in the network.
With the conclusion of our work, we believe that we have created a network capable of withstanding high levels of churn while ensuring fairness to all members of the network.
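As a rough illustration of a group-based DHT that reshapes its groups as nodes join, the sketch below partitions a small identifier space into contiguous ranges, each served by a group of nodes, and splits a group when it grows past a size limit. The names `GroupDHT`, `MAX_GROUP`, and `SPACE`, the 16-bit identifier space, and the split rule are assumptions for illustration, not the design from the thesis. Merging groups when nodes leave is omitted.

```python
import hashlib
from bisect import bisect_right

SPACE = 2 ** 16   # size of the (hypothetical) identifier space
MAX_GROUP = 4     # a group larger than this is a candidate for splitting


def h(name):
    """Map a node or key name to an identifier in [0, SPACE)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:2], "big")


class GroupDHT:
    def __init__(self):
        self.starts = [0]        # sorted lower bound of each group's range
        self.members = [set()]   # members[i] serve the range at starts[i]

    def group_of(self, name):
        # Index of the group whose range contains h(name).
        return bisect_right(self.starts, h(name)) - 1

    def _end(self, i):
        return self.starts[i + 1] if i + 1 < len(self.starts) else SPACE

    def join(self, node):
        i = self.group_of(node)
        self.members[i].add(node)
        lo, hi = self.starts[i], self._end(i)
        if len(self.members[i]) > MAX_GROUP and hi - lo > 1:
            mid = (lo + hi) // 2
            upper = {n for n in self.members[i] if h(n) >= mid}
            lower = self.members[i] - upper
            # Split only if both halves keep at least one member,
            # so that no range is left without a serving node.
            if upper and lower:
                self.members[i] = lower
                self.starts.insert(i + 1, mid)
                self.members.insert(i + 1, upper)
```

A key lookup uses the same `group_of` call as a node join, so any member of the returned group can answer for the key.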

    DEPAS: A Decentralized Probabilistic Algorithm for Auto-Scaling

    The dynamic provisioning of virtualized resources offered by cloud computing infrastructures allows applications deployed in a cloud environment to automatically increase and decrease the amount of resources they use. This capability is called auto-scaling, and its main purpose is to automatically adjust the scale of the system running the application to satisfy the varying workload with minimum resource utilization. Auto-scaling is particularly important during workload peaks, in which applications may need to scale up to extremely large-scale systems. Both the research community and the main cloud providers have already developed auto-scaling solutions. However, most research solutions are centralized and not suitable for managing large-scale systems. Moreover, cloud providers' solutions are bound to the limitations of a specific provider in terms of resource prices, availability, reliability, and connectivity. In this paper we propose DEPAS, a decentralized probabilistic auto-scaling algorithm integrated into a P2P architecture that is cloud provider independent, thus allowing the auto-scaling of services over multiple cloud infrastructures at the same time. Our simulations, which are based on real service traces, show that our approach is capable of: (i) keeping the overall utilization of all the instantiated cloud resources in a target range, and (ii) maintaining service response times close to the ones obtained using optimal centralized auto-scaling approaches. Comment: Submitted to Springer Computin
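The notion of decentralized probabilistic scaling can be sketched as follows. This is an illustration under stated assumptions rather than the DEPAS algorithm itself: the function name `scaling_decision`, the thresholds, and the probability formulas are hypothetical. The assumption is that every node has an estimate of the average system load and decides locally, with a probability chosen so that the expected number of added or removed nodes moves the average load back toward the target.

```python
import random


def scaling_decision(avg_load, target=0.7, low=0.5, high=0.9, rng=random):
    """Local decision of one node: +1 start a node, -1 shut down, 0 do nothing.

    With n nodes at average load L, reaching the target load needs about
    n * L / target nodes. If each node acts with probability |L / target - 1|,
    the expected number of additions or removals matches that gap.
    """
    if avg_load > high:
        p = min(1.0, avg_load / target - 1.0)   # capped: at most one new node
        return 1 if rng.random() < p else 0
    if avg_load < low:
        p = 1.0 - avg_load / target
        return -1 if rng.random() < p else 0
    return 0   # load is inside the target range
```

No coordinator is involved: each node calls the function on its own estimate, and only the aggregate effect across all nodes is controlled.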

    Data Storage and Dissemination in Pervasive Edge Computing Environments

    Nowadays, smart mobile devices generate huge amounts of data in all sorts of gatherings. Much of that data has localized and ephemeral interest, but can be of great use if shared among co-located devices. However, mobile devices often experience poor connectivity, leading to availability issues if application storage and logic are fully delegated to a remote cloud infrastructure. In turn, the edge computing paradigm pushes computations and storage beyond the data center, closer to end-user devices where data is generated and consumed. This enables the execution of certain components of edge-enabled systems directly and cooperatively on edge devices. This thesis focuses on the design and evaluation of resilient and efficient data storage and dissemination solutions for pervasive edge computing environments, operating with or without access to the network infrastructure. In line with this dichotomy, our goal can be divided into two specific scenarios. The first is related to the absence of network infrastructure and the provision of a transient data storage and dissemination system for networks of co-located mobile devices. The second relates to the existence of network infrastructure access and the corresponding edge computing capabilities. First, the thesis presents time-aware reactive storage (TARS), a reactive data storage and dissemination model with intrinsic time-awareness that exploits synergies between the storage substrate and the publish/subscribe paradigm, and allows queries within a specific time scope.
Next, it describes in more detail: i) Thyme, a data storage and dissemination system for wireless edge environments, implementing TARS; ii) Parsley, a flexible and resilient group-based distributed hash table with preemptive peer relocation and a dynamic data sharding mechanism; and iii) Thyme GardenBed, a framework for data storage and dissemination across multi-region edge networks that makes use of both device-to-device and edge interactions. The developed solutions present low overheads while providing adequate response times for interactive usage and low energy consumption, proving to be practical in a variety of situations. They also display good load balancing and fault tolerance properties.
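A minimal sketch of what a time-scoped combination of storage and publish/subscribe might look like is given below. It is only an illustration of the general idea stated in the abstract, not the TARS model or the Thyme API: the class `TimeAwareStore`, its method names, and the use of plain numeric timestamps are assumptions. A subscription carries a time scope, receives already-stored items that fall inside it, and is notified of later publications whose timestamps also fall inside it.

```python
from collections import defaultdict


class TimeAwareStore:
    """Hypothetical in-memory store where subscriptions have a time scope."""

    def __init__(self):
        self.items = defaultdict(list)   # tag -> [(timestamp, item)]
        self.subs = defaultdict(list)    # tag -> [(start, end, callback)]

    def subscribe(self, tag, start, end, callback):
        # Deliver stored items inside the time scope right away...
        for ts, item in self.items[tag]:
            if start <= ts <= end:
                callback(item)
        # ...and keep the subscription for matching future publications.
        self.subs[tag].append((start, end, callback))

    def publish(self, tag, timestamp, item):
        # Store the item, then notify subscriptions whose scope covers it.
        self.items[tag].append((timestamp, item))
        for start, end, callback in self.subs[tag]:
            if start <= timestamp <= end:
                callback(item)
```

A scope that lies entirely in the past behaves like a query over stored data, while a scope extending into the future behaves like a regular subscription.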

    Resilient Cloud-based Replication with Low Latency

    Existing approaches to tolerate Byzantine faults in geo-replicated environments require systems to execute complex agreement protocols over wide-area links and consequently are often associated with high response times. In this paper we address this problem with Spider, a resilient replication architecture for geo-distributed systems that leverages the availability characteristics of today's public-cloud infrastructures to minimize complexity and reduce latency. Spider models a system as a collection of loosely coupled replica groups whose members are hosted in different cloud-provided fault domains (i.e., availability zones) of the same geographic region. This structural organization makes it possible to achieve low response times by placing replica groups in close proximity to clients while still enabling the replicas of a group to interact over short-distance links. To handle the inter-group communication necessary for strong consistency, Spider uses a reliable group-to-group message channel with first-in-first-out semantics and built-in flow control that significantly simplifies system design. Comment: 25 pages, extended version of Middleware 2020 pape
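To make "first-in-first-out semantics and built-in flow control" concrete, the sketch below shows a simple point-to-point channel with sequence numbers and a bounded send window. It is a simplification under stated assumptions, not Spider's group-to-group channel: it has one sender and one receiver, assumes no Byzantine behaviour, and the names `FifoSender`, `FifoReceiver`, and `window` are hypothetical.

```python
class FifoSender:
    def __init__(self, window=2):
        self.window = window   # max number of unacknowledged messages
        self.next_seq = 0      # sequence number for the next message
        self.acked = 0         # every sequence number below this is acknowledged

    def try_send(self, payload):
        # Flow control: refuse to send while the window is full.
        if self.next_seq - self.acked >= self.window:
            return None
        msg = (self.next_seq, payload)
        self.next_seq += 1
        return msg

    def on_ack(self, seq):
        # Cumulative acknowledgement: seq and everything before it arrived.
        self.acked = max(self.acked, seq + 1)


class FifoReceiver:
    def __init__(self):
        self.expected = 0   # next sequence number to deliver
        self.pending = {}   # out-of-order messages waiting for their turn

    def on_message(self, msg):
        seq, payload = msg
        if seq >= self.expected:          # ignore duplicates already delivered
            self.pending[seq] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered                  # payloads in FIFO order
```

The receiver buffers messages that arrive early and releases them only once all earlier ones are present, while the sender stalls until acknowledgements free up room in its window.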