Client-access protocols for replicated services
LogBase: A Scalable Log-structured Database System in the Cloud
Numerous applications such as financial transactions (e.g., stock trading)
are write-heavy in nature. The shift from reads to writes in web applications
has also been accelerating in recent years. Write-ahead-logging is a common
approach for providing recovery capability while improving performance in most
storage systems. However, the separation of log and application data incurs
write overheads observed in write-heavy environments and hence adversely
affects the write throughput and recovery time in the system. In this paper, we
introduce LogBase, a scalable log-structured database system that adopts
log-only storage for removing the write bottleneck and supporting fast system
recovery. LogBase is designed to be dynamically deployed on commodity clusters
to take advantage of the elastic scaling property of cloud environments. LogBase
provides in-memory multiversion indexes for supporting efficient access to data
maintained in the log. LogBase also supports transactions that bundle read and
write operations spanning across multiple records. We implemented the proposed
system and compared it with HBase and a disk-based log-structured
record-oriented system modeled after RAMCloud. The experimental results show
that LogBase is able to provide sustained write throughput, efficient data
access out of the cache, and effective system recovery.
Comment: VLDB201
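The log-only design with in-memory multiversion indexes described in the LogBase abstract can be illustrated with a small sketch. This is not LogBase's implementation: the class `LogOnlyStore`, its methods, and the integer version counter are hypothetical names chosen for illustration, and the "log" is an in-memory list standing in for durable storage.

```python
from bisect import bisect_right


class LogOnlyStore:
    """Toy log-only store: the append-only log is the only copy of the data."""

    def __init__(self):
        self.log = []    # append-only list of (key, version, value) records
        self.index = {}  # key -> (sorted versions, parallel log offsets)
        self.clock = 0   # hypothetical monotonically increasing version counter

    def _index_record(self, key, version, offset):
        versions, offsets = self.index.setdefault(key, ([], []))
        versions.append(version)
        offsets.append(offset)

    def put(self, key, value):
        """Append one record to the log and index it; no separate data file."""
        self.clock += 1
        self.log.append((key, self.clock, value))
        self._index_record(key, self.clock, len(self.log) - 1)
        return self.clock

    def get(self, key, as_of=None):
        """Return the newest value, or the newest one with version <= as_of."""
        if key not in self.index:
            return None
        versions, offsets = self.index[key]
        pos = len(versions) if as_of is None else bisect_right(versions, as_of)
        if pos == 0:
            return None
        return self.log[offsets[pos - 1]][2]

    def recover(self):
        """Rebuild the in-memory index with a single scan of the log."""
        self.index = {}
        for offset, (key, version, _value) in enumerate(self.log):
            self._index_record(key, version, offset)
        self.clock = self.log[-1][1] if self.log else 0
```

Because every write is a single append, there is no second write to a separate data file, and recovery reduces to rebuilding the index from the log.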
Managing Population and Workload Imbalance in Structured Overlays
Every day the amount of data produced by networked devices increases. The current
paradigm is to offload the data produced to data centers to be processed. However, as
more and more devices offload their data to cloud data centers, accessing data becomes
increasingly challenging. To combat this problem, systems are bringing data closer
to the consumer and distributing network responsibilities among the end devices. We are
witnessing a change in the networking paradigm, where data storage and computation that
were once handled only in the cloud are being processed by Internet of Things (IoT) and
mobile devices, thanks to the ever-increasing technological capabilities of these devices.
One approach organizes these devices into a structured overlay network.
Structured overlays are a common approach to address the organization and
distribution of data in peer-to-peer distributed systems. Due to their nature, indexing and
searching for elements of the system become trivial, so structured overlays are
ideal building blocks for resource-location applications.
Such overlays assume that the data is distributed evenly over the peers, and that the
popularity of those data items is also evenly balanced. However, in many systems, due to
factors outside the system's domain, popularity may behave rather randomly,
causing some nodes to spend more resources on the popular items than others.
In this work we exploit the properties of cluster-based structured overlays and
propose to address this problem by equipping a structured overlay with mechanisms
to manage population and workload imbalance and achieve a more uniform use of
resources.
Our approach focuses on implementing a Group-Based Distributed Hash Table (DHT)
capable of dynamically changing its groups to accommodate churn in the
network.
With the conclusion of our work, we believe that we have created a network
capable of withstanding high levels of churn while ensuring fairness to all members of
the network.
Every day, the amount of data produced by networked devices increases. The current
paradigm is to offload the data produced to data centers for processing. However, as the
number of devices offloading data to these centers grows, accessing the data becomes
increasingly challenging. To combat this problem, systems are bringing data closer to
consumers and distributing network responsibilities among the devices. We are
witnessing a change in the networking paradigm, in which data storage and computation
that were once the responsibility of data centers are now handled by IoT mobile devices,
thanks to the growing technological capabilities of these devices. One approach joins
the devices into structured networks.
Structured networks are the most common means of organizing and distributing data in
peer-to-peer networks. Thanks to their properties, indexing and searching for elements
becomes trivial, so structured networks become the ideal building block for file search
systems.
These networks assume that the data is distributed evenly across all participants and
that all of the data is equally sought after. However, in many systems, due to external
factors, popularity behaves in a volatile and unpredictable way, overloading the
participants that store the most popular data.
This work attempts to exploit the properties of group-based structured networks to
confront the problem: we equip one of these networks with the mechanisms needed to
coordinate the participants and their load.
Our approach focuses on implementing a group-based DHT capable of dynamically
changing its groups to accommodate changes in network membership.
With the conclusion of our work, we believe that we have created a network capable of
withstanding high levels of instability while ensuring fairness to all members of the
network.
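The idea of a group-based DHT that reshapes its groups as membership changes can be sketched as follows. This is an illustrative toy, not the thesis's design: it assumes groups are identified by binary prefixes of a hashed identifier, that an oversized group splits in two, and it omits merging of undersized groups, replication, and routing. `GroupDHT`, `max_size`, and `bits` are hypothetical names.

```python
import hashlib


class GroupDHT:
    """Toy group-based DHT: each group owns one binary prefix of the ID space."""

    def __init__(self, bits=16, max_size=4):
        self.bits = bits
        self.max_size = max_size
        self.groups = {"": set()}  # prefix -> member node names

    def _id_bits(self, name):
        """Hash a node name or key to a fixed-width bit string."""
        digest = hashlib.sha256(name.encode()).digest()
        value = int.from_bytes(digest[:4], "big") >> (32 - self.bits)
        return format(value, f"0{self.bits}b")

    def _owner_prefix(self, id_bits):
        # Prefixes partition the ID space, so exactly one of them matches.
        return next(p for p in self.groups if id_bits.startswith(p))

    def lookup(self, key):
        """Return the members of the group responsible for a key."""
        return self.groups[self._owner_prefix(self._id_bits(key))]

    def join(self, node):
        """Add a node to its group, splitting the group while it is oversized."""
        prefix = self._owner_prefix(self._id_bits(node))
        self.groups[prefix].add(node)
        while len(self.groups[prefix]) > self.max_size and len(prefix) < self.bits:
            members = self.groups.pop(prefix)
            children = (prefix + "0", prefix + "1")
            for child in children:
                self.groups[child] = {
                    m for m in members if self._id_bits(m).startswith(child)
                }
            # At most one child can still be oversized; keep splitting that one.
            prefix = max(children, key=lambda p: len(self.groups[p]))
```

Splitting on population is only one half of the picture; a fuller sketch would also merge small groups when nodes leave and could split on request load rather than member count.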
DEPAS: A Decentralized Probabilistic Algorithm for Auto-Scaling
The dynamic provisioning of virtualized resources offered by cloud computing
infrastructures allows applications deployed in a cloud environment to
automatically increase and decrease the amount of used resources. This
capability is called auto-scaling and its main purpose is to automatically
adjust the scale of the system that is running the application to satisfy the
varying workload with minimum resource utilization. The need for auto-scaling
is particularly important during workload peaks, in which applications may need
to scale up to extremely large-scale systems.
Both the research community and the main cloud providers have already
developed auto-scaling solutions. However, most research solutions are
centralized and not suitable for managing large-scale systems; moreover, cloud
providers' solutions are bound to the limitations of a specific provider in
terms of resource prices, availability, reliability, and connectivity.
In this paper we propose DEPAS, a decentralized probabilistic auto-scaling
algorithm integrated into a P2P architecture that is cloud provider
independent, thus allowing the auto-scaling of services over multiple cloud
infrastructures at the same time. Our simulations, which are based on real
service traces, show that our approach is capable of: (i) keeping the overall
utilization of all the instantiated cloud resources in a target range, (ii)
maintaining service response times close to the ones obtained using optimal
centralized auto-scaling approaches.
Comment: Submitted to Springer Computin
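A decentralized probabilistic scaling rule of the kind the DEPAS abstract describes can be sketched as below. The formulas are an assumption made for illustration, not taken from the paper: each node compares an average load estimate against a target range and, with a probability proportional to the capacity gap, either starts one new node or removes itself. In this toy every node sees the exact global average, whereas a real P2P system would estimate it from neighbours.

```python
import random


def scaling_decision(avg_load, low, high, target, rng=random):
    """Local decision of one node: 'add', 'remove', or 'stay'."""
    if avg_load > high:
        # If n nodes each run at avg_load, about n * (avg_load / target - 1)
        # extra nodes are needed, so each node adds one with that probability.
        p = min(1.0, avg_load / target - 1.0)
        return "add" if rng.random() < p else "stay"
    if avg_load < low:
        # Symmetric case: a fraction (1 - avg_load / target) of nodes is surplus.
        p = min(1.0, 1.0 - avg_load / target)
        return "remove" if rng.random() < p else "stay"
    return "stay"


def simulate(total_work, nodes, rounds, low=0.5, high=0.9, target=0.7, seed=0):
    """Run several rounds with a fixed total workload; return the node count."""
    rng = random.Random(seed)
    for _ in range(rounds):
        avg_load = total_work / nodes
        decisions = [
            scaling_decision(avg_load, low, high, target, rng)
            for _ in range(nodes)
        ]
        nodes += decisions.count("add") - decisions.count("remove")
        nodes = max(nodes, 1)  # never scale down to zero nodes
    return nodes
```

No node needs a global view of the membership: the expected number of additions or removals matches the capacity gap even though every decision is taken independently.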
Data Storage and Dissemination in Pervasive Edge Computing Environments
Nowadays, smart mobile devices generate huge amounts of data in all sorts of gatherings.
Much of that data has localized and ephemeral interest, but can be of great use if shared
among co-located devices. However, mobile devices often experience poor connectivity,
leading to availability issues if application storage and logic are fully delegated to a
remote cloud infrastructure. In turn, the edge computing paradigm pushes computations
and storage beyond the data center, closer to end-user devices where data is generated
and consumed. This enables the execution of certain components of edge-enabled
systems directly and cooperatively on edge devices.
This thesis focuses on the design and evaluation of resilient and efficient data storage
and dissemination solutions for pervasive edge computing environments, operating with
or without access to the network infrastructure. In line with this dichotomy, our goal can
be divided into two specific scenarios. The first one is related to the absence of network
infrastructure and the provision of a transient data storage and dissemination system
for networks of co-located mobile devices. The second one relates to the existence of
network infrastructure access and the corresponding edge computing capabilities.
First, the thesis presents time-aware reactive storage (TARS), a reactive data storage
and dissemination model with intrinsic time-awareness, that exploits synergies between
the storage substrate and the publish/subscribe paradigm, and allows queries within a
specific time scope. Next, it describes in more detail: i) Thyme, a data storage and
dissemination system for wireless edge environments, implementing TARS; ii) Parsley, a
flexible and resilient group-based distributed hash table with preemptive peer relocation
and a dynamic data sharding mechanism; and iii) Thyme GardenBed, a framework
for data storage and dissemination across multi-region edge networks, that makes use of
both device-to-device and edge interactions.
The developed solutions present low overheads, while providing adequate response
times for interactive usage and low energy consumption, proving to be practical in a
variety of situations. They also display good load balancing and fault tolerance properties.
Abstract
Nowadays, smart mobile devices generate large amounts of data at all kinds of
gatherings of people. Much of this data is of localized and ephemeral interest, but can
be very useful if shared among co-located devices. However, mobile devices often
experience poor connectivity, leading to availability problems if application storage
and logic are fully delegated to a remote cloud infrastructure. In turn, the edge
computing paradigm moves computation and storage beyond the data centers, closer to
the end-user devices where data is generated and consumed. This enables the execution
of certain system components directly and cooperatively on edge devices.
This thesis focuses on the design and evaluation of resilient and efficient solutions for
data storage and dissemination in pervasive edge computing environments, operating
with or without access to the network infrastructure. In line with this dichotomy, our
goal can be divided into two specific scenarios. The first relates to the absence of
network infrastructure and the provision of an ephemeral data storage and
dissemination system for networks of co-located mobile devices. The second concerns
the existence of access to the network infrastructure and the corresponding edge
computing resources.
First, the thesis presents time-aware reactive storage (TARS), a reactive data storage
and dissemination model with intrinsic time-awareness that exploits synergies between
the storage substrate and the publish/subscribe paradigm and allows queries within a
specific time scope. Next, it describes in more detail: i) Thyme, a data storage and
dissemination system for wireless edge environments that implements TARS; ii) Parsley,
a flexible and resilient group-based distributed hash table with preemptive node
relocation and a dynamic data partitioning mechanism; and iii) Thyme GardenBed, a
system for data storage and dissemination across multi-region edge networks that makes
use of device-to-device and edge interactions.
The developed solutions present low overheads, providing adequate response times for
interactive use and low energy consumption, and prove to be practical in a wide variety
of situations. These solutions also exhibit good load balancing and fault tolerance
properties.
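The combination of storage and publish/subscribe with time-scoped queries that the abstract attributes to TARS can be illustrated with a minimal single-process sketch. It is not Thyme's API: `TimeAwareStore`, its method names, and the use of plain numbers as timestamps are assumptions made for illustration, and distribution, replication, and item expiry are left out.

```python
class TimeAwareStore:
    """Toy time-aware reactive store: subscriptions carry a time scope."""

    def __init__(self):
        self.items = []          # (timestamp, tags, payload)
        self.subscriptions = []  # (tag, start, end, callback)

    def subscribe(self, tag, start, end, callback):
        """Deliver stored items inside [start, end], then watch for new ones."""
        for ts, tags, payload in self.items:
            if tag in tags and start <= ts <= end:
                callback(ts, payload)
        self.subscriptions.append((tag, start, end, callback))

    def publish(self, ts, tags, payload):
        """Store an item and notify every subscription whose scope covers it."""
        tags = frozenset(tags)
        self.items.append((ts, tags, payload))
        for tag, start, end, callback in self.subscriptions:
            if tag in tags and start <= ts <= end:
                callback(ts, payload)
```

A subscription whose scope starts in the past behaves like a query over stored data, and one whose scope extends into the future behaves like a classic subscription, which is the sense in which the two paradigms share one interface here.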
Resilient Cloud-based Replication with Low Latency
Existing approaches to tolerate Byzantine faults in geo-replicated
environments require systems to execute complex agreement protocols over
wide-area links and consequently are often associated with high response times.
In this paper we address this problem with Spider, a resilient replication
architecture for geo-distributed systems that leverages the availability
characteristics of today's public-cloud infrastructures to minimize complexity
and reduce latency. Spider models a system as a collection of loosely coupled
replica groups whose members are hosted in different cloud-provided fault
domains (i.e., availability zones) of the same geographic region. This
structural organization makes it possible to achieve low response times by
placing replica groups in close proximity to clients while still enabling the
replicas of a group to interact over short-distance links. To handle the
inter-group communication necessary for strong consistency, Spider uses a
reliable group-to-group message channel with first-in-first-out semantics and
built-in flow control that significantly simplifies system design.
Comment: 25 pages, extended version of Middleware 2020 pape
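A message channel with first-in-first-out delivery and built-in flow control, as mentioned in the Spider abstract, can be sketched in a few lines. This is a simplified assumption-laden toy rather than Spider's channel: one object plays both sender and receiver, the window size is a hypothetical parameter, and Byzantine fault tolerance, retransmission, and group membership are not modeled.

```python
class FifoChannel:
    """Toy FIFO channel with a sliding send window for flow control."""

    def __init__(self, window=2):
        self.window = window
        self.next_send = 0     # next sequence number to assign
        self.acked = 0         # every sequence number below this was delivered
        self.next_deliver = 0  # next sequence number the receiver expects
        self.pending = {}      # receiver buffer for out-of-order packets
        self.delivered = []    # payloads handed to the application, in order

    def try_send(self, payload):
        """Return a (seq, payload) packet, or None if the window is full."""
        if self.next_send - self.acked >= self.window:
            return None
        packet = (self.next_send, payload)
        self.next_send += 1
        return packet

    def receive(self, packet):
        """Buffer the packet and deliver every in-order payload available."""
        seq, payload = packet
        if seq >= self.next_deliver:  # ignore duplicates of delivered packets
            self.pending[seq] = payload
        while self.next_deliver in self.pending:
            self.delivered.append(self.pending.pop(self.next_deliver))
            self.next_deliver += 1
        self.acked = self.next_deliver  # acknowledgement reopens the window
```

The window bounds how far a fast sender can run ahead of a slow receiver, and the sequence numbers let the receiver restore order when packets arrive shuffled.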