266 research outputs found
Crux: Locality-Preserving Distributed Services
Distributed systems achieve scalability by distributing load across many
machines, but wide-area deployments can introduce worst-case response latencies
proportional to the network's diameter. Crux is a general framework to build
locality-preserving distributed systems, by transforming an existing scalable
distributed algorithm A into a new locality-preserving algorithm ALP, which
guarantees for any two clients u and v interacting via ALP that their
interactions exhibit worst-case response latencies proportional to the network
latency between u and v. Crux builds on compact-routing theory, but generalizes
these techniques beyond routing applications. Crux provides weak and strong
consistency flavors, and shows latency improvements for localized interactions
in both cases, specifically up to several orders of magnitude for
weakly-consistent Crux (from roughly 900ms to 1ms). We deployed on PlanetLab
locality-preserving versions of a Memcached distributed cache, a Bamboo
distributed hash table, and a Redis publish/subscribe. Our results indicate
that Crux is effective and applicable to a variety of existing distributed
algorithms.Comment: 11 figure
Rethinking Distributed Caching Systems Design and Implementation
Distributed caching systems based on in-memory key-value stores have become a
crucial aspect of fast and efficient content delivery in modern web-applications. However,
due to the dynamic and skewed execution environments and workloads, under which
such systems typically operate, several problems arise in the form of load imbalance.
This thesis addresses the sources of load imbalance in caching systems, mainly: i) data
placement, which relates to distribution of data items across servers and ii) data item
access frequency, which describes amount of requests each server has to process, and how
each server is able to cope with it. Thus, providing several strategies to overcome the
sources of imbalance in isolation.
As a use case, we analyse Memcached, its variants, and propose a novel solution for
distributed caching systems. Our solution revolves around increasing parallelism through
load segregation, and solutions to overcome the load discrepancies when reaching high
saturation scenarios, mostly through access re-arrangement, and internal replication.Os sistemas de cache distribuídos baseados em armazenamento de pares chave-valor
em RAM, tornaram-se um aspecto crucial em aplicações web modernas para o fornecimento
rápido e eficiente de conteúdo. No entanto, estes sistemas normalmente estão
sujeitos a ambientes muito dinâmicos e irregulares. Este tipo de ambientes e irregularidades,
causa vários problemas, que emergem sob a forma de desequilíbrios de carga.
Esta tese aborda as diferentes origens de desequilíbrio de carga em sistemas de caching
distribuído, principalmente: i) colocação de dados, que se relaciona com a distribuição
dos dados pelos servidores e a ii) frequência de acesso aos dados, que reflete a quantidade
de pedidos que cada servidor deve processar e como cada servidor lida com a sua carga.
Desta forma, demonstramos várias estratégias para reduzir o impacto proveniente das
fontes de desequilíbrio, quando analizadas em isolamento.
Como caso de uso, analisamos o sistema Memcached, as suas variantes, e propomos
uma nova solução para sistemas de caching distribuídos. A nossa solução gira em torno
de aumento de paralelismo atraves de segregação de carga e em como superar superar as
discrepâncias de carga a quando de sistema entra em grande saturação, principalmente
atraves de reorganização de acesso e de replicação intern
Cache policies for cloud-based systems: To keep or not to keep
In this paper, we study cache policies for cloud-based caching. Cloud-based
caching uses cloud storage services such as Amazon S3 as a cache for data items
that would have been recomputed otherwise. Cloud-based caching departs from
classical caching: cloud resources are potentially infinite and only paid when
used, while classical caching relies on a fixed storage capacity and its main
monetary cost comes from the initial investment. To deal with this new context,
we design and evaluate a new caching policy that minimizes the overall cost of
a cloud-based system. The policy takes into account the frequency of
consumption of an item and the cloud cost model. We show that this policy is
easier to operate, that it scales with the demand and that it outperforms
classical policies managing a fixed capacity.Comment: Proceedings of IEEE International Conference on Cloud Computing 2014
(CLOUD 14
Dependability Evaluation of Middleware Technology for Large-scale Distributed Caching
Distributed caching systems (e.g., Memcached) are widely used by service
providers to satisfy accesses by millions of concurrent clients. Given their
large-scale, modern distributed systems rely on a middleware layer to manage
caching nodes, to make applications easier to develop, and to apply load
balancing and replication strategies. In this work, we performed a
dependability evaluation of three popular middleware platforms, namely
Twemproxy by Twitter, Mcrouter by Facebook, and Dynomite by Netflix, to assess
availability and performance under faults, including failures of Memcached
nodes and congestion due to unbalanced workloads and network link bandwidth
bottlenecks. We point out the different availability and performance trade-offs
achieved by the three platforms, and scenarios in which few faulty components
cause cascading failures of the whole distributed system.Comment: 2020 IEEE 31st International Symposium on Software Reliability
Engineering (ISSRE 2020
Write-Optimal Radix Tree: A Deterministic Indexing Structure for Persistent Memory Storage Systems
Department of Computer Science and EngineeringRecent interest in persistent memory (PM) has stirred development of index structures that are efficient in PM. Recent such developments have all focused on variations of the B-tree. In this paper, we show that the radix tree, which is another less popular indexing structure, can be more appropriate as an efficient PM indexing structure. This is because the radix tree structure is determined by the prefix of the inserted keys and also does not require tree rebalancing operations and node granularity updates. However, the radix tree as-is cannot be used in PM. As another contribution, we present three radix tree variants, namely, WORT (Write Optimal Radix Tree), WOART (Write Optimal Adaptive Radix Tree), and ART+CoW. Of these, the first two are optimal for PM in the sense that theyonly use one 8-byte failure-atomic write per update to guarantee the consistency of the structure and do not require any duplicate copies for logging or CoW. Extensive performance studies show that our proposed radix tree variants perform considerable better thanrecently proposed B-tree variants for PM such NVTree, wB+Tree, and FPTree for synthetic workloads as well as in implementations within Memcached.ope
High Performance Computing using Infiniband-based clusters
L'abstract è presente nell'allegato / the abstract is in the attachmen
Cache Serializability: Reducing Inconsistency in Edge Transactions
Read-only caches are widely used in cloud infrastructures to reduce access
latency and load on backend databases. Operators view coherent caches as
impractical at genuinely large scale and many client-facing caches are updated
in an asynchronous manner with best-effort pipelines. Existing solutions that
support cache consistency are inapplicable to this scenario since they require
a round trip to the database on every cache transaction.
Existing incoherent cache technologies are oblivious to transactional data
access, even if the backend database supports transactions. We propose T-Cache,
a novel caching policy for read-only transactions in which inconsistency is
tolerable (won't cause safety violations) but undesirable (has a cost). T-Cache
improves cache consistency despite asynchronous and unreliable communication
between the cache and the database. We define cache-serializability, a variant
of serializability that is suitable for incoherent caches, and prove that with
unbounded resources T-Cache implements this new specification. With limited
resources, T-Cache allows the system manager to choose a trade-off between
performance and consistency.
Our evaluation shows that T-Cache detects many inconsistencies with only
nominal overhead. We use synthetic workloads to demonstrate the efficacy of
T-Cache when data accesses are clustered and its adaptive reaction to workload
changes. With workloads based on the real-world topologies, T-Cache detects
43-70% of the inconsistencies and increases the rate of consistent transactions
by 33-58%.Comment: Ittay Eyal, Ken Birman, Robbert van Renesse, "Cache Serializability:
Reducing Inconsistency in Edge Transactions," Distributed Computing Systems
(ICDCS), IEEE 35th International Conference on, June~29 2015--July~2 201
- …