42 research outputs found

WebWave: Globally Load Balanced Fully Distributed Caching of Hot Published Documents

Author: Heddaya Abdelsalam
Mirdad Sulaiman
Publication venue: Boston University Computer Science Department
Publication date: 10/10/1996

Document publication service over such a large network as the Internet challenges us to harness available server and network resources to meet fast growing demand. In this paper, we show that large-scale dynamic caching can be employed to globally minimize server idle time, and hence maximize the aggregate server throughput of the whole service. To be efficient, scalable and robust, a successful caching mechanism must have three properties: (1) maximize the global throughput of the system, (2) find cache copies without recourse to a directory service, or to a discovery protocol, and (3) be completely distributed in the sense of operating only on the basis of local information. In this paper, we develop a precise definition, which we call tree load-balance (TLB), of what it means for a mechanism to satisfy these three goals. We present an algorithm that computes TLB off-line, and a distributed protocol that induces a load distribution that converges quickly to a TLB one. Both algorithms place cache copies of immutable documents on the routing tree that connects the cached document's home server to its clients, thus enabling requests to stumble on cache copies en route to the home server.

Harvard University; The Saudi Cultural Mission to the U.S.A

Boston University Institutional Repository (OpenBU)

An Analysis of Distributed Systems Syllabi With a Focus on Performance-Related Topics

Author: Abad Cristina L.
Boza Edwin F.
Iosup Alexandru
Ortiz-Holguin Eduardo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 02/03/2021

We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi from top Computer Science programs, focusing on finding the prevalence and context in which topics related to performance are being taught in these courses. We also study the scale of the infrastructure mentioned in DS courses, from small client-server systems to cloud-scale, peer-to-peer, global-scale systems. We make eight main findings, covering goals such as performance, and scalability and its variant elasticity; activities such as performance benchmarking and monitoring; eight selected performance-enhancing techniques (replication, caching, sharding, load balancing, scheduling, streaming, migrating, and offloading); and control issues such as trade-offs that include performance and performance variability.

Comment: Accepted for publication at WEPPE 2021, to be held in conjunction with ACM/SPEC ICPE 2021: https://doi.org/10.1145/3447545.3451197 This article is a follow-up of our prior ACM SIGCSE publication, arXiv:2012.0055

arXiv.org e-Print Archive

VU Research Portal

Reliable Replication at Low Cost

Author: Honeyman Peter
Zhang Jiaying
Publication venue: Center for Information Technology Integration
Publication date: 01/01/2006

The emerging global scientific collaborations demand a scalable, efficient, reliable, and still convenient data access and management scheme. To fulfill these requirements, this paper describes a replicated file system that supports mutable (i.e., read/write) replication with strong consistency guarantees, small performance penalty, high failure resilience, and good scaling properties. The paper further evaluates the system using a real scientific application. The evaluation results show that the presented replication system can significantly improve the application's performance by reducing the first-time access latency to read the input data and by distributing the verification of data access to a nearby server. Furthermore, the penalty of file replication is negligible as long as applications use synchronous writes at a moderate rate.

http://deepblue.lib.umich.edu/bitstream/2027.42/107950/1/citi-tr-06-2.pd

Deep Blue Documents at the University of Michigan

Replication Control in Distributed File Systems

Author: Honeyman Peter
Zhang Jiaying
Publication venue: Center for Information Technology Integration
Publication date: 01/04/2004

We present a replication control protocol for distributed file systems that can guarantee strict consistency or sequential consistency while imposing no performance overhead for normal reads. The protocol uses a primary-copy scheme with server redirection when concurrent writes occur. It tolerates any number of component omission and performance failures, even when these lead to network partition. Failure detection and recovery are driven by client accesses. No heartbeat messages or expensive group communication services are required. We have implemented the protocol in NFSv4, the emerging Internet standard for distributed filing.

http://deepblue.lib.umich.edu/bitstream/2027.42/107880/1/citi-tr-04-1.pd

Deep Blue Documents at the University of Michigan

Group Communication in Amoeba and its Applications

Author: Kaashoek M.F.
Tanenbaum A.S.
Verstoep K.
Publication date: 01/01/1993

Unlike many other operating systems, Amoeba is a distributed operating system that provides group communication (i.e., one-to-many communication). We wil

CiteSeerX

VU Research Portal

State Machine Replication Is More Expensive Than Consensus

Author: Antoniadis Karolos
Guerraoui Rachid
Malkhi Dahlia
Seredinschi Dragos-Adrian
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd International Symposium on Distributed Computing (DISC 2018)
Publication date: 01/01/2018

Consensus and State Machine Replication (SMR) are generally considered to be equivalent problems. In certain system models, indeed, the two problems are computationally equivalent: any solution to the former problem leads to a solution to the latter, and vice versa. In this paper, we study the relation between consensus and SMR from a complexity perspective. We find that, surprisingly, completing an SMR command can be more expensive than solving a consensus instance. Specifically, given a synchronous system model where every instance of consensus always terminates in constant time, completing an SMR command does not necessarily terminate in constant time. This result naturally extends to partially synchronous models. Besides theoretical interest, our result also corresponds to practical phenomena we identify empirically. We experiment with two well-known SMR implementations (Multi-Paxos and Raft) and show that, indeed, SMR is more expensive than consensus in practice. One important implication of our result is that, even under synchrony conditions, no SMR algorithm can ensure bounded response times.

Infoscience - École polytechnique fédérale de Lausanne

Dagstuhl Research Online Publication Server

A replicated file system for Grid computing

Author: Honeyman Peter
Zhang Jiaying
Publication venue: Wiley
Publication date: 25/06/2008

To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing. Copyright © 2008 John Wiley & Sons, Ltd.

Peer Reviewed

http://deepblue.lib.umich.edu/bitstream/2027.42/60228/1/1286_ftp.pd

Deep Blue Documents at the University of Michigan

Viewstamped Replication Revisited

Author: Cowling James
Liskov Barbara
Publication date: 23/07/2012

This paper presents an updated version of Viewstamped Replication, a replication technique that handles failures in which nodes crash. It describes how client requests are handled, how the group reorganizes when a replica fails, and how a failed replica is able to rejoin the group. The paper also describes a number of important optimizations and presents a protocol for handling reconfigurations that can change both the group membership and the number of failures the group is able to handle.

CiteSeerX

DSpace@MIT

Naming, Migration, and Replication for NFSv4

Author: Honeyman Peter
Zhang Jiaying
Publication venue: Center for Information Technology Integration
Publication date: 01/01/2006

In this paper, we discuss a global name space for NFSv4 and mechanisms for transparent migration and replication. By convention, any file or directory name beginning with /nfs on an NFS client is part of this shared global name space. Our system supports file system migration and replication through DNS resolution, provides directory migration and replication using built-in NFSv4 mechanisms, and supports read/write replication with precise consistency guarantees, small performance penalty, and good scaling. We implement these features with small extensions to the published NFSv4 protocol, and demonstrate a practical way to enhance network transparency and administerability of NFSv4 in wide area networks.

http://deepblue.lib.umich.edu/bitstream/2027.42/107939/1/citi-tr-06-1.pd

Deep Blue Documents at the University of Michigan