36 research outputs found
Cloud-edge hybrid applications
Many modern applications are designed to provide interactions among users, including multi-user games, social networks and collaborative tools. Users expect application response time to be in the order of milliseconds, to foster interaction and interactivity.
The design of these applications typically adopts a client-server model, where all interactions are mediated by a centralized component. This approach introduces availability and fault-tolerance issues, which can be mitigated by replicating the server component, and even relying on geo-replicated solutions in cloud computing infrastructures. Even in this case, the client-server communication model leads to unnecessary latency penalties for geographically close clients and high operational costs for the application provider.
This dissertation proposes a cloud-edge hybrid model with secure and efficient propagation and consistency mechanisms. This model combines client-side replication and client-to-client propagation to provide low latency and minimize the dependency on the server infrastructure, fostering availability and fault tolerance. To realize this model, this work makes the following key contributions.
First, the cloud-edge hybrid model is materialized by a system design where clients maintain replicas of the data and synchronize in a peer-to-peer fashion, and servers are used to assist clients' operation. We study how to bring most of the application logic to the client side, using the centralized service primarily for durability, access control, discovery, and overcoming internetwork limitations.
Second, we define protocols for weakly consistent data replication, including a novel CRDT model (∆-CRDTs). We provide a study on partial replication, exploring the challenges and fundamental limitations in providing causal consistency, and the difficulty in supporting client-side replicas due to their ephemeral nature.
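The ∆-CRDT idea mentioned above can be sketched in a few lines. The following is a hypothetical Python illustration (our sketch, not the dissertation's actual implementation) of a delta-state grow-only counter: mutators return small deltas instead of the full state, and replicas converge by joining deltas with a pointwise maximum.

```python
# Hypothetical sketch of a delta-state CRDT (grow-only counter).
# Mutators return compact deltas; join is a pointwise max, which is
# idempotent, commutative, and associative, so replicas converge.

class DeltaGCounter:
    """Grow-only counter; state maps replica_id -> local count."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.state = {}

    def increment(self):
        # Delta mutator: apply locally, return only the changed entry.
        new = self.state.get(self.replica_id, 0) + 1
        self.state[self.replica_id] = new
        return {self.replica_id: new}  # the delta to propagate

    def join(self, delta):
        # Merge a received delta (or full state) entry by entry.
        for rid, count in delta.items():
            self.state[rid] = max(self.state.get(rid, 0), count)

    def value(self):
        return sum(self.state.values())
```

Because the join is the same for deltas and full states, a replica that misses some deltas can always resynchronize by joining another replica's entire state.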
Third, we study how client misbehaviour can impact the guarantees of causal consistency.
We propose new secure weak consistency models for insecure settings, and algorithms to enforce
such consistency models.
The experimental evaluation of our contributions has shown their specific benefits and limitations compared with the state of the art. In general, the cloud-edge hybrid model leads to faster application response times, lower client-to-client latency, higher system scalability (as fewer clients need to connect to servers at the same time), the possibility to work offline or disconnected from the server, and reduced server bandwidth usage.
In summary, we propose a hybrid of cloud and edge which provides lower user-to-user latency, availability under server disconnections, and improved server scalability, while being efficient, reliable, and secure.
Dynamic formation of a distributed micro-cloud environment in cloud computing
This thesis presents research in the field of distributed systems. We present the dynamic organization of geo-distributed edge nodes into micro data-centers forming micro clouds to cover any arbitrary area and expand capacity, availability, and reliability. The organization of the cloud is used as an inspiration, with adaptations for a different environment, a clear separation of concerns, and a native applications model that can leverage the newly formed system. With this separation of concerns, the edge-native applications model, and a unified node organization, we move towards the idea of edge computing as a service, like any other utility in cloud computing. We also give formal models for all protocols used for the creation of such a system.
Programming Languages and Systems
This open access book constitutes the proceedings of the 31st European Symposium on Programming, ESOP 2022, which was held during April 5-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 21 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.
Designing the replication layer of a general-purpose datacenter key-value store
Online services and cloud applications such as graph applications, messaging systems,
coordination services, HPC applications, social networks and deep learning rely on
key-value stores (KVSes), in order to reliably store and quickly retrieve data. KVSes
are NoSQL Databases with a read/write/read-modify-write API. KVSes replicate their
dataset in a few servers, such that the KVS can continue operating in the presence of
faults (availability). To allow programmers to reason about replication, KVSes specify
a set of rules (consistency), which are enforced through the use of replication protocols.
These rules must be intuitive to facilitate programmer productivity (programmability).
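As a rough illustration of how a replication protocol turns consistency rules into code, here is a minimal, hypothetical primary-backup sketch in Python (the names and structure are ours, not the thesis's): a write is acknowledged only after every backup has applied it, so any replica can serve reads without missing acknowledged writes.

```python
# Hypothetical primary-backup replication sketch for a KVS.
# Writes are acknowledged only after all backups apply them
# (synchronous replication), so reads from any replica observe
# every acknowledged write.

class Replica:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value

    def read(self, key):
        return self.store.get(key)

class Primary(Replica):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def write(self, key, value):
        self.apply(key, value)
        for b in self.backups:
            b.apply(key, value)   # ack only after all backups apply
        return "ok"
```

Real protocols must also handle failures, retries, and concurrent writers, which is exactly where the performance tension discussed below arises.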
A general-purpose KVS must maximize the number of operations executed per
unit of time within a predetermined latency (performance) without compromising on
consistency, availability or programmability. However, all three of these guarantees
are at odds with performance. In this thesis, we explore the design of the replication
layer of a general-purpose KVS, which is responsible for navigating this trade-off, by
specifying and enforcing the consistency and availability guarantees of the KVS.
We start the exploration by observing that modern, server-grade hardware with
manycore servers and RDMA-capable networks, challenges conventional wisdom in
protocol design. In order to investigate the impact of these advances on protocols and
their design, we first create an informal taxonomy of strongly-consistent replication
protocols. We focus on strong consistency semantics because they are necessary for a
general-purpose KVS and they are at odds with performance. Based on this taxonomy
we carefully select 10 protocols for analysis. Second, we present Odyssey, a framework tailored towards protocol implementation for multi-threaded, RDMA-enabled,
in-memory, replicated KVSes. Using Odyssey, we characterize the design space of
strongly-consistent replication protocols, by building, evaluating and comparing the
10 protocols.
Our evaluation demonstrates that some of the protocols that were efficient in yesterday’s hardware are not so today because they cannot take advantage of the abundant
parallelism and fast networking present in modern hardware. Conversely, some protocols that were inefficient in yesterday’s hardware are very attractive today. We distil
our findings in a concise set of general guidelines and recommendations for protocol
selection and design in the era of modern hardware.
The second step of our exploration focuses on the tension between consistency
and performance. The problem is that expensive strongly-consistent primitives are
necessary to achieve synchronization, but in typical applications only a small fraction
of accesses is actually used for synchronization. To navigate this trade-off, we advocate
the adoption of Release Consistency (RC) for KVSes. We argue that RC’s one-sided
barriers are ideal for capturing the ordering relationship between synchronization and
non-synchronization accesses while enabling high performance.
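To make the one-sided barrier concrete, here is a hypothetical Python sketch (our illustration, not Kite's actual API) of a Release Consistency style client: relaxed writes may sit in a buffer, and a release write first flushes everything written before it, so a reader that observes the released flag also observes the preceding data.

```python
# Hypothetical sketch of Release Consistency for a KV client.
# Relaxed writes can be buffered/reordered cheaply; a release write
# acts as a one-sided barrier, making all earlier writes visible
# before the flag itself is published.

class RCClient:
    def __init__(self, store):
        self.store = store    # shared dict standing in for the KVS
        self.buffer = []      # relaxed writes not yet visible

    def relaxed_write(self, key, value):
        self.buffer.append((key, value))   # may be delayed

    def release_write(self, key, value):
        for k, v in self.buffer:           # barrier: flush everything
            self.store[k] = v              # written before the release
        self.buffer.clear()
        self.store[key] = value            # then publish the flag

    def acquire_read(self, key):
        return self.store.get(key)
```

The point of the asymmetry is that only synchronization accesses (the release/acquire pair) pay the ordering cost; the bulk of relaxed accesses remain fast.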
We present Kite, a general-purpose, replicated KVS that enforces RC through a
novel fast/slow path mechanism that leverages the absence of failures in the typical
case to maximize performance, while relying on the slow path for progress. In addition, Kite leverages our study of replication protocols to select the most suitable
protocols for its primitives and is implemented over Odyssey to make the most out of
modern hardware. Finally, Kite does not compromise on consistency, availability or
programmability, as it provides sufficient primitives to implement any algorithm (consistency), does not interrupt its operation on a failure (availability), and offers the RC
API that programmers are already familiar with (programmability).
Reducing the Latency of Dependent Operations in Large-Scale Geo-Distributed Systems
Many applications rely on large-scale distributed systems for data management and computation. These distributed systems are complex and built from different networked services. Dependencies between these services can create a chain of dependent network I/O operations that have to be executed sequentially. This can result in high service latencies, especially when the chain consists of inter-datacenter operations.
To address the latency problem of executing dependent network I/O operations, this thesis introduces new approaches and techniques to reduce the required number of operations that have to be executed sequentially for three system types. First, it addresses the high transaction completion time in geo-distributed database systems that have data sharded and replicated across different geographical regions. For a single transaction, most existing systems sequentially execute reads, writes, 2PC, and a replication protocol because of dependencies between these parts. This thesis looks at using a more restrictive transaction model in order to break dependencies and allow different parts to execute in parallel.
Second, dependent network I/O operations also lead to high latency for performing leader-based state machine replication across a wide-area network. Fast Paxos introduces a fast path that bypasses the leader for request ordering. However, when concurrent requests arrive at replicas in different orders, the fast path may fail, and Fast Paxos has to fall back to a slow path. This thesis explores the use of network measurements to establish a global order for requests across replicas, allowing Fast Paxos to be effective for concurrent requests.
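The measurement-based ordering idea can be sketched as follows. This is a simplified, hypothetical model (assuming loosely synchronized clocks and a known maximum one-way delay): every replica holds a request until `send_time + max_delay` and delivers in stamp order, so replicas that received concurrent requests in different orders still deliver the same sequence.

```python
# Hypothetical sketch: derive a global request order from network
# measurements. Each request is stamped send_time + max_delay; all
# replicas deliver in stamp order (ties broken by request id), so
# the delivered sequence is identical regardless of arrival order.

def global_order(arrivals, max_delay):
    """arrivals: (send_time, request_id) pairs in this replica's
    arrival order. Returns request ids in the agreed global order."""
    return [rid for _, rid in sorted((t + max_delay, rid)
                                     for t, rid in arrivals)]
```

With such a pre-agreed order, concurrent requests no longer cause the Fast Paxos fast path to fail due to replicas observing conflicting orders.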
Finally, this thesis proposes a general solution to reduce the latency impact of dependent operations in distributed systems through the use of speculative execution. For many systems, domain knowledge can be used to predict an operation’s result and speculatively execute subsequent operations, potentially allowing a chain of dependent operations to execute in parallel. This thesis introduces a framework that provides system-level support for performing speculative network I/O operations.
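The speculation pattern can be condensed into a small sketch (hypothetical names; a real framework would overlap the operations concurrently rather than run them sequentially): predict the slow operation's result from domain knowledge, execute the dependent operation on the guess, and fall back to re-execution on a misprediction.

```python
# Hypothetical sketch of speculative execution for dependent
# operations. On a correct prediction, the dependency chain
# collapses; on a misprediction, we pay one extra re-execution.

def run_with_speculation(slow_op, predict, dependent_op):
    guess = predict()                  # domain-knowledge prediction
    speculative = dependent_op(guess)  # would overlap slow_op in practice
    actual = slow_op()
    if actual == guess:
        return speculative             # hit: chain executed "in parallel"
    return dependent_op(actual)        # miss: redo with the real result
```

The system-level framework's job is then to isolate speculative side effects until the prediction is confirmed, so a misprediction can be rolled back safely.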
These three approaches reduce the number of sequentially performed network I/O operations in different domains. Our performance evaluation shows that they can significantly reduce the latency of critical infrastructure services, allowing these services to be used by latency-sensitive applications.
Preserving individuals' privacy with Personal Data Management Systems
Riding the wave of smart disclosure initiatives and new privacy-protection regulations, the Personal Cloud paradigm is emerging through a myriad of solutions offered to users to let them gather and manage their whole digital life. On the bright side, this opens the way to novel value-added services when crossing multiple sources of data of a given person or crossing the data of multiple people. Yet this paradigm shift towards user empowerment raises fundamental questions with regards to the appropriateness of the functionalities and the data management and protection techniques which are offered by existing solutions to laymen users. Our work addresses these questions on three levels. First, we review, compare and analyze personal cloud alternatives in terms of the functionalities they provide and the threat models they target. From this analysis, we derive a general set of functionality and security requirements that any Personal Data Management System (PDMS) should consider. We then identify the challenges of implementing such a PDMS and propose a preliminary design for an extensive and secure PDMS reference architecture satisfying the considered requirements. Second, we focus on personal computations for a specific hardware PDMS instance (i.e., secure token with mass storage of NAND Flash). In this context, we propose a scalable embedded full-text search engine to index large document collections and manage tag-based access control policies. Third, we address the problem of collective computations in a fully-distributed architecture of PDMSs. 
We discuss the system and security requirements and propose protocols to enable distributed query processing with strong security guarantees against an attacker mastering many colluding corrupted nodes.
A decentralized framework for cross administrative domain data sharing
Federation of messaging and storage platforms located in remote datacenters is an essential functionality to share data among geographically distributed platforms. When systems are administered by the same owner, data replication reduces data access latency, bringing data closer to applications, and enables fault tolerance in the face of disaster recovery of an entire location. When storage platforms are administered by different owners, data replication across different administrative domains is essential for enterprise application data integration. Contents and services managed by different software platforms need to be integrated to provide richer contents and services. Clients may need to share subsets of data in order to enable collaborative analysis and service integration. Platforms usually include proprietary federation functionalities and specific APIs to let external software and platforms access their internal data. These different techniques may not be applicable to all environments and networks due to security and technological restrictions. Moreover, the federation of dispersed nodes under a decentralized administration scheme is still a research issue. This thesis is a contribution along this research direction, as it introduces and describes a framework, called "WideGroups", directed towards the creation and management of an automatic federation and integration of widely dispersed platform nodes. It is based on groups to exchange messages among distributed applications located in different remote datacenters. Groups are created and managed using client-side programmatic configuration without touching servers. WideGroups enables the extension of software platform services to nodes belonging to different administrative domains in a wide-area network environment. It lets different nodes form ad-hoc overlay networks on the fly, depending on message destinations located in distinct administrative domains.
It supports multiple dynamic overlay networks based on message groups, dynamic discovery of nodes, and automatic setup of overlay networks among nodes with no server-side configuration. I designed and implemented platform connectors to integrate the framework as the federation module of Message Oriented Middleware and Key Value Store platforms, which are among the most widespread paradigms supporting data sharing in distributed systems.
XXV Argentine Congress of Computer Science - CACIC 2019: proceedings
Papers presented at the XXV Argentine Congress of Computer Science (CACIC), held in the city of Río Cuarto from October 14 to 18, 2019, organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Ciencias Exactas, Físico-Químicas y Naturales of the Universidad Nacional de Río Cuarto.
From Controlled Data-Center Environments to Open Distributed Environments: Scalable, Efficient, and Robust Systems with Extended Functionality
The past two decades have witnessed several paradigm shifts in computing environments, starting with cloud computing, which offers on-demand allocation of storage, network, compute, and memory resources, as well as other services, in a pay-as-you-go billing model, and ending with the rise of permissionless blockchain technology, a decentralized computing paradigm with lower trust assumptions and a limitless number of participants. Unlike in the cloud, where all the computing resources are owned by some trusted cloud provider, permissionless blockchains allow computing resources owned by possibly malicious parties to join and leave their network without obtaining permission from some centralized trusted authority. Still, in the presence of malicious parties, permissionless blockchain networks can perform general computations and make progress. Cloud computing is powered by geographically distributed data-centers controlled and managed by trusted cloud service providers and promises theoretically infinite computing resources. On the other hand, permissionless blockchains are powered by open networks of geographically distributed computing nodes owned by entities that are not necessarily known or trusted. This paradigm shift requires a reconsideration of distributed data management protocols and distributed system designs that assume low latency across system components, inelastic computing resources, or fully trusted computing resources. In this dissertation, we propose new system designs and optimizations that address the scalability and efficiency of distributed data management systems in cloud environments. We also propose several protocols and new programming paradigms to extend the functionality and enhance the robustness of permissionless blockchains.
The work presented spans global-scale transaction processing, large-scale stream processing, atomic transaction processing across permissionless blockchains, and extending the functionality and use-cases of permissionless blockchains. In all these directions, the focus is on rethinking system and protocol designs to account for novel cloud and permissionless blockchain assumptions. For global-scale transaction processing, we propose GPlacer, a placement optimization framework that decides replica placement for fully and partially geo-replicated databases. For large-scale stream processing, we propose Cache-on-Track (CoT), an adaptive and elastic client-side cache that addresses server-side load imbalances that occur in large-scale distributed storage layers. In permissionless blockchain transaction processing, we propose AC3WN, the first correct cross-chain commitment protocol that guarantees atomicity of cross-chain transactions. Also, we propose TXSC, a transactional smart contract programming framework. TXSC provides smart contract developers with transaction primitives. These primitives allow developers to write smart contracts without the need to reason about the anomalies that can arise due to concurrent smart contract function executions. In addition, we propose a forward-looking architecture that unifies both permissioned and permissionless blockchains and exploits the running infrastructure of permissionless blockchains to build global asset management systems.
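To illustrate the client-side caching idea behind a cache like CoT, here is a hypothetical Python sketch (our simplification, not the actual CoT algorithm): the client tracks access frequencies and caches only the current top-k hottest keys, absorbing skewed load before it reaches the storage servers.

```python
# Hypothetical sketch of a hot-key client-side cache: only the top-k
# most frequently accessed keys are cached, so a small cache absorbs
# the skewed portion of the workload that causes server imbalance.

from collections import Counter

class HotKeyCache:
    def __init__(self, backend, capacity):
        self.backend = backend        # stands in for the storage layer
        self.capacity = capacity      # how many hot keys to track
        self.freq = Counter()
        self.cache = {}

    def get(self, key):
        self.freq[key] += 1
        if key in self.cache:
            return self.cache[key]    # hot key: served locally
        value = self.backend[key]     # miss: fetch from the server
        hottest = {k for k, _ in self.freq.most_common(self.capacity)}
        if key in hottest:
            self.cache[key] = value
            for k in list(self.cache):
                if k not in hottest:  # evict keys that fell out of top-k
                    del self.cache[k]
        return value
```

An adaptive variant would also resize `capacity` based on observed skew, which is the "elastic" aspect the abstract refers to.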