5,958 research outputs found

    Basis Token Consistency: A Practical Mechanism for Strong Web Cache Consistency

    Full text link
    With web caching and cache-related services like CDNs and edge services playing an increasingly significant role in the modern internet, the problem of the weak consistency and coherence provisions in current web protocols is becoming increasingly significant and drawing the attention of the standards community [LCD01]. Toward this end, we present definitions of consistency and coherence for web-like environments, that is, distributed client-server information systems where the semantics of interactions with resource are more general than the read/write operations found in memory hierarchies and distributed file systems. We then present a brief review of proposed mechanisms which strengthen the consistency of caches in the web, focusing upon their conceptual contributions and their weaknesses in real-world practice. These insights motivate a new mechanism, which we call "Basis Token Consistency" or BTC; when implemented at the server, this mechanism allows any client (independent of the presence and conformity of any intermediaries) to maintain a self-consistent view of the server's state. This is accomplished by annotating responses with additional per-resource application information which allows client caches to recognize the obsolescence of currently cached entities and identify responses from other caches which are already stale in light of what has already been seen. The mechanism requires no deviation from the existing client-server communication model, and does not require servers to maintain any additional per-client state. We discuss how our mechanism could be integrated into a fragment-assembling Content Management System (CMS), and present a simulation-driven performance comparison between the BTC algorithm and the use of the Time-To-Live (TTL) heuristic.National Science Foundation (ANI-9986397, ANI-0095988

    Object Distribution Networks for World-wide Document Circulation

    Get PDF
    This paper presents an Object Distribution System (ODS), a distributed system inspired by the ultra-large scale distribution models used in everyday life (e.g. food or newspapers distribution chains). Beyond traditional mechanisms of approaching information to readers (e.g. caching and mirroring), this system enables the publication, classification and subscription to volumes of objects (e.g. documents, events). Authors submit their contents to publication agents. Classification authorities provide classification schemes to classify objects. Readers subscribe to topics or authors, and retrieve contents from their local delivery agent (like a kiosk or library, with local copies of objects). Object distribution is an independent process where objects circulate asynchronously among distribution agents. ODS is designed to perform specially well in an increasingly populated, widespread and complex Internet jungle, using weak consistency replication by object distribution, asynchronous replication, and local access to objects by clients. ODS is based on two independent virtual networks, one dedicated to the distribution (replication) of objects and the other to calculate optimised distribution chains to be applied by the first network

    Cache policies for cloud-based systems: To keep or not to keep

    Full text link
    In this paper, we study cache policies for cloud-based caching. Cloud-based caching uses cloud storage services such as Amazon S3 as a cache for data items that would have been recomputed otherwise. Cloud-based caching departs from classical caching: cloud resources are potentially infinite and only paid when used, while classical caching relies on a fixed storage capacity and its main monetary cost comes from the initial investment. To deal with this new context, we design and evaluate a new caching policy that minimizes the overall cost of a cloud-based system. The policy takes into account the frequency of consumption of an item and the cloud cost model. We show that this policy is easier to operate, that it scales with the demand and that it outperforms classical policies managing a fixed capacity.Comment: Proceedings of IEEE International Conference on Cloud Computing 2014 (CLOUD 14

    Theory and Practice of Transactional Method Caching

    Get PDF
    Nowadays, tiered architectures are widely accepted for constructing large scale information systems. In this context application servers often form the bottleneck for a system's efficiency. An application server exposes an object oriented interface consisting of set of methods which are accessed by potentially remote clients. The idea of method caching is to store results of read-only method invocations with respect to the application server's interface on the client side. If the client invokes the same method with the same arguments again, the corresponding result can be taken from the cache without contacting the server. It has been shown that this approach can considerably improve a real world system's efficiency. This paper extends the concept of method caching by addressing the case where clients wrap related method invocations in ACID transactions. Demarcating sequences of method calls in this way is supported by many important application server standards. In this context the paper presents an architecture, a theory and an efficient protocol for maintaining full transactional consistency and in particular serializability when using a method cache on the client side. In order to create a protocol for scheduling cached method results, the paper extends a classical transaction formalism. Based on this extension, a recovery protocol and an optimistic serializability protocol are derived. The latter one differs from traditional transactional cache protocols in many essential ways. An efficiency experiment validates the approach: Using the cache a system's performance and scalability are considerably improved

    A Literature Survey of Cooperative Caching in Content Distribution Networks

    Full text link
    Content distribution networks (CDNs) which serve to deliver web objects (e.g., documents, applications, music and video, etc.) have seen tremendous growth since its emergence. To minimize the retrieving delay experienced by a user with a request for a web object, caching strategies are often applied - contents are replicated at edges of the network which is closer to the user such that the network distance between the user and the object is reduced. In this literature survey, evolution of caching is studied. A recent research paper [15] in the field of large-scale caching for CDN was chosen to be the anchor paper which serves as a guide to the topic. Research studies after and relevant to the anchor paper are also analyzed to better evaluate the statements and results of the anchor paper and more importantly, to obtain an unbiased view of the large scale collaborate caching systems as a whole.Comment: 5 pages, 5 figure

    Cache Serializability: Reducing Inconsistency in Edge Transactions

    Full text link
    Read-only caches are widely used in cloud infrastructures to reduce access latency and load on backend databases. Operators view coherent caches as impractical at genuinely large scale and many client-facing caches are updated in an asynchronous manner with best-effort pipelines. Existing solutions that support cache consistency are inapplicable to this scenario since they require a round trip to the database on every cache transaction. Existing incoherent cache technologies are oblivious to transactional data access, even if the backend database supports transactions. We propose T-Cache, a novel caching policy for read-only transactions in which inconsistency is tolerable (won't cause safety violations) but undesirable (has a cost). T-Cache improves cache consistency despite asynchronous and unreliable communication between the cache and the database. We define cache-serializability, a variant of serializability that is suitable for incoherent caches, and prove that with unbounded resources T-Cache implements this new specification. With limited resources, T-Cache allows the system manager to choose a trade-off between performance and consistency. Our evaluation shows that T-Cache detects many inconsistencies with only nominal overhead. We use synthetic workloads to demonstrate the efficacy of T-Cache when data accesses are clustered and its adaptive reaction to workload changes. With workloads based on the real-world topologies, T-Cache detects 43-70% of the inconsistencies and increases the rate of consistent transactions by 33-58%.Comment: Ittay Eyal, Ken Birman, Robbert van Renesse, "Cache Serializability: Reducing Inconsistency in Edge Transactions," Distributed Computing Systems (ICDCS), IEEE 35th International Conference on, June~29 2015--July~2 201

    Exact Analysis of TTL Cache Networks: The Case of Caching Policies driven by Stopping Times

    Full text link
    TTL caching models have recently regained significant research interest, largely due to their ability to fit popular caching policies such as LRU. This paper advances the state-of-the-art analysis of TTL-based cache networks by developing two exact methods with orthogonal generality and computational complexity. The first method generalizes existing results for line networks under renewal requests to the broad class of caching policies whereby evictions are driven by stopping times. The obtained results are further generalized, using the second method, to feedforward networks with Markov arrival processes (MAP) requests. MAPs are particularly suitable for non-line networks because they are closed not only under superposition and splitting, as known, but also under input-output caching operations as proven herein for phase-type TTL distributions. The crucial benefit of the two closure properties is that they jointly enable the first exact analysis of feedforward networks of TTL caches in great generality

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Full text link
    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology. This would help evaluate their applicability for solving similar problems. This taxonomy also provides a "gap analysis" of this area through which researchers can potentially identify new issues for investigation. Finally, we hope that the proposed taxonomy and mapping also helps to provide an easy way for new practitioners to understand this complex area of research.Comment: 46 pages, 16 figures, Technical Repor
    • 

    corecore