
    Understanding Sharded Caching Systems

    Sharding is a method for allocating data items to nodes of a distributed caching or storage system based on the result of a hash function computed on the item identifier. It is ubiquitously used in key-value stores, CDNs and many other applications. Although considerable work has focused on the design and implementation of such systems, there is limited theoretical understanding of their performance under realistic operational conditions. In this paper we fill this gap by providing a thorough model of sharded caching systems, focusing particularly on load balancing and caching performance. Our analysis provides important insights that can be applied to optimize the design and configuration of sharded caching systems.
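
    To make the scheme concrete, here is a minimal sketch of hash-based sharding as the abstract describes it: an item is assigned to a node by hashing its identifier and reducing modulo the node count. The node count, the example keys, and the choice of SHA-256 are illustrative assumptions, not details from the paper.

    ```python
    import hashlib

    def shard_for(item_id: str, num_nodes: int) -> int:
        """Map an item identifier to a node by hashing the identifier."""
        digest = hashlib.sha256(item_id.encode("utf-8")).digest()
        # Interpret the first 8 bytes of the digest as an integer and reduce it
        # modulo the node count; a uniform hash spreads items evenly in expectation.
        return int.from_bytes(digest[:8], "big") % num_nodes

    # Hypothetical example: distribute a few keys across 4 cache nodes.
    for key in ["user:42", "video:abc", "page:/index.html"]:
        print(key, "->", shard_for(key, num_nodes=4))
    ```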

    Load Imbalance and Caching Performance of Sharded Systems

    Sharding is a method for allocating data items to nodes of a distributed caching or storage system based on the result of a hash function computed on the item’s identifier. It is ubiquitously used in key-value stores, CDNs and many other applications. Although considerable work has focused on the design and implementation of such systems, there is limited theoretical understanding of their performance under realistic operational conditions. In this paper we fill this gap by providing a thorough model of sharded caching systems, focusing particularly on load balancing and caching performance. Our analysis provides important insights that can be applied to optimize the design and configuration of sharded caching systems.

    Fast Consistent Hashing in Constant Time

    Consistent hashing is a technique that can minimize key remapping when the number of hash buckets changes. The paper proposes a fast consistent hash algorithm (called power consistent hash) that has O(1) expected time for key lookup, independent of the number of buckets. Hash values are computed in real time; no search data structure is constructed to store bucket ranges or key mappings. The algorithm has a lightweight design using O(1) space with superior scalability. In particular, it uses two auxiliary hash functions to achieve distribution uniformity and O(1) expected time for key lookup. Furthermore, it performs consistent hashing such that only a minimal number of keys are remapped when the number of buckets changes. Consistent hashing has a wide range of use cases, including load balancing, distributed caching, and distributed key-value stores. The proposed algorithm is faster than well-known consistent hash algorithms with O(log n) lookup time. Comment: 11 pages, 2 figures.
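
    For context, the following sketch shows classic ring-based consistent hashing with O(log n) lookup, the kind of baseline the abstract compares against; it is not the paper's power consistent hash, and the virtual-node count and hash function are assumptions made for illustration.

    ```python
    import bisect
    import hashlib

    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    class HashRing:
        """Ring-based consistent hashing with virtual nodes.

        Lookup is O(log n) via binary search over the sorted ring; adding or
        removing a bucket only remaps keys in the affected arcs of the ring.
        """

        def __init__(self, buckets, vnodes=64):
            self._ring = sorted(
                (_hash(f"{bucket}#{i}"), bucket)
                for bucket in buckets
                for i in range(vnodes)
            )
            self._positions = [pos for pos, _ in self._ring]

        def lookup(self, key: str) -> str:
            # Pick the first virtual node clockwise from the key's position.
            idx = bisect.bisect(self._positions, _hash(key)) % len(self._ring)
            return self._ring[idx][1]

    # Hypothetical usage: most keys keep their bucket when the bucket set changes.
    ring = HashRing(["cache-a", "cache-b", "cache-c"])
    print(ring.lookup("user:42"))
    ```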

    An Analysis of Load Imbalance in Scale-out Data Serving

    Despite the natural parallelism across lookups, the performance of distributed key-value stores is often limited by load imbalance induced by heavy skew in the popularity distribution of the dataset. To avoid violating service level objectives expressed in terms of tail latency, systems tend to keep server utilization low and organize the data in micro-shards, which in turn provide units of migration and replication for the purpose of load balancing. These techniques reduce the skew, but incur additional monitoring, data replication and consistency maintenance overheads. This work shows that the trend towards extreme scale-out will further exacerbate the skew-induced load imbalance, and hence the overhead of migration and replication.
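
    A small simulation sketch of the effect described above: under static sharding, a Zipf-skewed popularity distribution concentrates load on a few servers. The Zipf exponent, key count, server count, and request count below are assumed values for illustration, not parameters taken from this study.

    ```python
    import collections
    import random

    def zipf_popularity(num_keys: int, alpha: float = 0.99):
        """Popularity weights where the key of rank r has weight 1 / r**alpha."""
        weights = [1.0 / (rank ** alpha) for rank in range(1, num_keys + 1)]
        total = sum(weights)
        return [w / total for w in weights]

    def max_to_mean_load(num_keys=10_000, num_servers=16, num_requests=200_000, seed=1):
        rng = random.Random(seed)
        popularity = zipf_popularity(num_keys)
        # Statically assign keys to servers by key id modulo the server count,
        # then count how many requests each server receives.
        keys = rng.choices(range(num_keys), weights=popularity, k=num_requests)
        load = collections.Counter(key % num_servers for key in keys)
        mean = num_requests / num_servers
        return max(load.values()) / mean

    # The hottest server ends up serving well above its fair share of requests.
    print(f"max/mean load ratio: {max_to_mean_load():.2f}")
    ```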

    Mitigating Load Imbalance in Distributed Data Serving with Rack-Scale Memory Pooling

    To provide low-latency and high-throughput guarantees, most large key-value stores keep the data in the memory of many servers. Despite the natural parallelism across lookups, the load imbalance introduced by heavy skew in the popularity distribution of keys limits performance. To avoid violating tail latency service-level objectives, systems tend to keep server utilization low and organize the data in micro-shards, which provide units of migration and replication for the purpose of load balancing. These techniques reduce the skew but incur additional monitoring, data replication, and consistency maintenance overheads. In this work, we introduce RackOut, a memory pooling technique that leverages the one-sided remote read primitive of emerging rack-scale systems to mitigate load imbalance while respecting service-level objectives. In RackOut, the data are aggregated at rack-scale granularity, with all of the participating servers in the rack jointly servicing all of the rack’s micro-shards. We develop a queuing model to evaluate the impact of RackOut at the datacenter scale. In addition, we implement a RackOut proof-of-concept key-value store, evaluate it on two experimental platforms based on RDMA and Scale-Out NUMA, and use these results to validate the model. We devise two distinct approaches to load balancing within a RackOut unit: RackOut_static, which selects nodes at random, and RackOut_adaptive, which uses an adaptive load balancing mechanism. Our results show that RackOut_static increases throughput by up to 6× for RDMA and 8.6× for Scale-Out NUMA compared to a scale-out deployment, while respecting tight tail latency service-level objectives. RackOut_adaptive improves throughput over RackOut_static by 30% for workloads with 20% writes.
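
    The sketch below illustrates the two within-rack dispatch policies the abstract names: random node selection (as in RackOut_static) and an adaptive policy (standing in for RackOut_adaptive). Picking the least-loaded server is assumed here as the adaptive rule, and the class and method names are hypothetical rather than taken from the RackOut implementation.

    ```python
    import random

    class RackDispatcher:
        """Toy request dispatcher for a rack whose servers can all access every
        micro-shard through one-sided remote reads, so any server may serve
        any request.

        "static" picks a server uniformly at random; "adaptive" picks the server
        with the fewest outstanding requests (an assumed adaptive policy).
        """

        def __init__(self, num_servers: int, policy: str = "static", seed: int = 0):
            self.outstanding = [0] * num_servers
            self.policy = policy
            self._rng = random.Random(seed)

        def dispatch(self) -> int:
            if self.policy == "static":
                server = self._rng.randrange(len(self.outstanding))
            else:
                server = min(range(len(self.outstanding)),
                             key=self.outstanding.__getitem__)
            self.outstanding[server] += 1
            return server

        def complete(self, server: int) -> None:
            self.outstanding[server] -= 1

    # Hypothetical usage: route one request within an 8-server rack.
    rack = RackDispatcher(num_servers=8, policy="adaptive")
    print("request served by server", rack.dispatch())
    ```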

    CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

    Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store that uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing, which are key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and a modest throughput overhead (5%) for read-intensive workloads.
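
    As a generic illustration of the quorum machinery such systems build on (not CATS's consistent quorums themselves), the sketch below checks the classic read/write quorum intersection condition and resolves a quorum read by version number; the replica counts and version fields are assumptions for illustration.

    ```python
    from dataclasses import dataclass

    def quorums_intersect(n: int, r: int, w: int) -> bool:
        """True when every read quorum overlaps every write quorum (R + W > N),
        so a read is guaranteed to see the latest acknowledged write."""
        return r + w > n

    @dataclass
    class Replica:
        version: int
        value: str

    def quorum_read(responses, r: int) -> str:
        """Take the first r replica responses and return the value carrying the
        highest version, the usual way a quorum read resolves divergent replicas."""
        latest = max(responses[:r], key=lambda rep: rep.version)
        return latest.value

    # Hypothetical configuration: 5 replicas with majority read and write quorums.
    print(quorums_intersect(n=5, r=3, w=3))                        # True
    print(quorum_read([Replica(3, "v3"), Replica(2, "v2")], r=2))  # "v3"
    ```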

    Energy storage in the UK electrical network: estimation of the scale and review of technology options

    This paper aims to clarify the difference between non-rechargeable stores of energy, such as fossil fuels, and the storage of electricity in rechargeable devices. The existing scale of these two distinct types of storage is considered in the UK context, followed by a review of rechargeable technology options. The storage is found to be overwhelmingly contained within the fossil-fuel stores of conventional generators, but their scale is thought to be determined by the risks associated with long supply chains and price variability. The paper also aims to add to the debate regarding the need for more flexible supply and demand within the UK electrical network in order to balance the expected increase in wind-derived generation. We conclude that the decarbonisation challenge facing the UK electricity sector should be seen not only as a supply and demand challenge but also as a storage challenge.