311 research outputs found

    Towards Soft Circuit Breaking in Service Meshes via Application-agnostic Caching

    Service meshes factor out code dealing with inter-micro-service communication, such as circuit breaking. Circuit breaking actuation is currently limited to an "on/off" switch, i.e., a tripped circuit breaker will return an application-level error indicating service unavailability to the calling micro-service. This paper proposes a soft circuit breaker actuator, which returns cached data instead of an error. The overall resilience of a cloud application is improved if constituent micro-services return stale data instead of no data at all. While caching is widely employed for serving web service traffic, its usage in inter-micro-service communication is lacking. Micro-service responses are highly dynamic, which requires carefully choosing adaptive time-to-live caching algorithms. We evaluate our approach through two experiments. First, we quantify the trade-off between traffic reduction and data staleness using a purpose-built service, thereby identifying algorithm configurations that keep data staleness at about 3% or less while reducing network load by up to 30%. Second, we quantify the network load reduction with the micro-service benchmark by Google Cloud called Hipster Shop. Our approach results in caching of about 80% of requests. Results show the feasibility and efficiency of our approach, which encourages implementing caching as a circuit breaking actuator in service meshes.
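
The soft circuit breaker idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SoftCircuitBreaker`, its parameters (`failure_threshold`, `reset_after`), and the `"fresh"`/`"stale"` labels are all hypothetical, and a real service mesh would do this in a sidecar proxy rather than in application code.

```python
import time


class SoftCircuitBreaker:
    """Hypothetical sketch: answer with stale cached data when the upstream fails."""

    def __init__(self, call, failure_threshold=3, reset_after=30.0,
                 clock=time.monotonic):
        self.call = call                      # function key -> response (the upstream)
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after        # seconds the breaker stays open
        self.clock = clock
        self.failures = 0
        self.opened_at = None                 # None means the breaker is closed
        self.cache = {}                       # last good response per key

    def _is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Half-open: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
            return False
        return True

    def request(self, key):
        if not self._is_open():
            try:
                value = self.call(key)
            except Exception:
                self.failures += 1
                if self.failures >= self.failure_threshold:
                    self.opened_at = self.clock()
            else:
                self.failures = 0
                self.cache[key] = value
                return value, "fresh"
        # Upstream failed or breaker is open: fall back to stale data if any.
        if key in self.cache:
            return self.cache[key], "stale"
        raise RuntimeError("service unavailable and no cached response")
```

A classic "on/off" breaker would raise in every case where this sketch returns a `"stale"` response; the error path remains only for keys that were never cached.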

    Adaptive and Application-agnostic Caching in Service Meshes for Resilient Cloud Applications

    Service meshes factor out code dealing with inter-micro-service communication. The overall resilience of a cloud application is improved if constituent micro-services return stale data instead of no data at all. This paper proposes and implements application-agnostic caching for micro-services. While caching is widely employed for serving web service traffic, its usage in inter-micro-service communication is lacking. Micro-service responses are highly dynamic, which requires carefully choosing adaptive time-to-live caching algorithms. Our approach is application-agnostic, is cloud native, and supports gRPC. We evaluate our approach and implementation using the micro-service benchmark by Google Cloud called Hipster Shop. Our approach results in caching of about 80% of requests. Results show the feasibility and efficiency of our approach, which encourages implementing caching in service meshes. Additionally, we make the code, experiments, and data publicly available.
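
To make "adaptive time-to-live" concrete, here is one generic heuristic, sketched under stated assumptions. It is not taken from the paper, whose algorithms the abstract does not specify: the function `next_ttl` and all of its parameters are hypothetical. The TTL grows when a revalidation finds the response unchanged and shrinks when the response has changed.

```python
def next_ttl(ttl, changed, min_ttl=1.0, max_ttl=300.0, grow=2.0, shrink=0.5):
    """Return the TTL (seconds) to use after one revalidation.

    ttl     -- TTL that just expired
    changed -- True if the upstream response differed from the cached one
    All bounds and factors are illustrative defaults, not values from the paper.
    """
    factor = shrink if changed else grow
    # Clamp so the TTL never collapses to zero or grows without bound.
    return max(min_ttl, min(max_ttl, ttl * factor))
```

With these defaults, a response that keeps changing converges to the 1-second floor (little staleness, little traffic saved), while a stable response climbs to the 300-second ceiling; this is the staleness versus traffic trade-off the abstracts above refer to.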

    A Survey of Deep Learning for Data Caching in Edge Network

    Edge caching in emerging 5G and beyond mobile networks is a promising way both to relieve traffic congestion in the core network and to reduce latency when accessing popular content. In that respect, end-user demand for popular content can be satisfied by proactively caching it at the network edge, i.e., in close proximity to the users. In addition to model-based caching schemes, learning-based edge caching optimizations have recently attracted significant attention, and the aim hereafter is to capture these recent advances for both model-based and data-driven techniques in the area of proactive caching. This paper summarizes the utilization of deep learning for data caching in edge networks. We first outline the typical research topics in content caching and formulate a taxonomy based on the hierarchical structure of the network. Then, a number of key types of deep learning algorithms are presented, ranging from supervised learning to unsupervised learning as well as reinforcement learning. Furthermore, a comparison of state-of-the-art literature is provided from the aspects of caching topics and deep learning methods. Finally, we discuss research challenges and future directions of applying deep learning for caching.

    Cacheability study for web content delivery

    Master's, MASTER OF SCIENCE

    Techniques of data prefetching, replication, and consistency in the Internet

    The Internet has become a major infrastructure for information sharing in our daily life, and is indispensable to critical and large applications in industry, government, business, and education. Internet bandwidth (the network speed to transfer data) has increased dramatically; however, latency (the delay to physically access data) has been reduced at a much slower pace. Rich bandwidth and lagging latency can be effectively coped with in Internet systems by three data management techniques: caching, replication, and prefetching. The focus of this dissertation is to address the latency problem in the Internet by utilizing the rich bandwidth and large storage capacity to prefetch data efficiently and so significantly improve Web content caching performance, and by proposing and implementing scalable data consistency maintenance methods to handle Internet Web address caching in the Domain Name System (DNS) and massive data replication in peer-to-peer systems. While the DNS service is critical in the Internet, peer-to-peer data sharing is being accepted as an important activity in the Internet. We have made three contributions in developing prefetching techniques. First, we have proposed an efficient data structure for maintaining Web access information, called popularity-based Prediction by Partial Matching (PB-PPM), where data are placed and replaced guided by popularity information of Web accesses, so that only important and useful information is stored. PB-PPM greatly reduces the required storage space and improves the prediction accuracy. Second, a major weakness in existing Web servers is that prefetching activities are scheduled independently of dynamically changing server workloads. Without proper control and coordination between the two kinds of activities, prefetching can negatively affect Web services and degrade Web access performance. To address this problem, we have developed a queuing model to characterize the interactions. Guided by the model, we have designed a coordination scheme that dynamically adjusts the prefetching aggressiveness in Web servers. This scheme not only prevents the Web servers from being overloaded, but can also minimize the average server response time. Finally, we have proposed a scheme that effectively coordinates the sharing of access information for both proxy and Web servers. With the support of this scheme, the accuracy of prefetching decisions is significantly improved. Regarding data consistency support for Internet caching and data replication, we have conducted three significant studies. First, we have developed a consistency support technique to maintain data consistency among the replicas in structured P2P networks. Based on Pastry, an existing and popular P2P system, we have implemented this scheme and show that it can effectively maintain consistency while preventing hot-spot and node-failure problems. Second, we have designed and implemented a DNS cache update protocol, called DNScup, to provide strong consistency for domain/IP mappings. Finally, we have developed a dynamic lease scheme to update the replicas in the Internet in a timely manner.

    Improving Data Delivery in Wide Area and Mobile Environments

    The popularity of the Internet has dramatically increased the diversity of clients and applications that access data across wide area networks and mobile environments. Data delivery in these environments presents several challenges. First, applications often have diverse requirements with respect to the latency of their requests and recency of data. Traditional data delivery architectures do not provide interfaces to express these requirements. Second, it is difficult to accurately estimate when objects are updated. Existing solutions either require servers to notify clients (push-based), which adds overhead at servers and may not scale, or require clients to contact servers (pull-based), which relies on estimates that are often inaccurate in practice. Third, cache managers need a flexible and scalable way to determine if an object in the cache meets a client's latency and recency preferences. Finally, mobile clients who access data on wireless networks share limited wireless bandwidth and typically have different QoS requirements for different applications. In this dissertation we address these challenges using two complementary techniques, client profiles and server cooperation. Client profiles are a set of parameters that enable clients to communicate application-specific latency and recency preferences to caches and wireless base stations. Profiles are used by cache managers to determine whether to deliver a cached object to the client or to validate the object at a remote server, and for scheduling data delivery to mobile clients. Server cooperation enables servers to provide resource information to cache managers, which enables cache managers to estimate the recency of cached objects. The main contributions of this dissertation are as follows: First, we present a flexible and scalable architecture to support client profiles that is straightforward to implement at a cache. Second, we present techniques to improve estimates of the recency of cached objects using server cooperation by increasing the amount of information servers provide to caches. Third, for mobile clients, we present a framework for incorporating profiles into the cache utilization, downloading, and scheduling decisions at a wireless base station. We evaluate client profiles and server cooperation using synthetic and trace data. Finally, we present an implementation of profiles and experimental results.
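
The deliver-or-validate decision described in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the dissertation's profile model: the types `ClientProfile` and `CachedObject`, the single `max_staleness` parameter, and the function `decide` are all hypothetical, and the abstract does not say which parameters real profiles contain.

```python
from dataclasses import dataclass


@dataclass
class ClientProfile:
    # Hypothetical recency preference: oldest acceptable copy, in seconds.
    max_staleness: float


@dataclass
class CachedObject:
    value: str
    fetched_at: float  # time the copy was last fetched or validated


def decide(profile: ClientProfile, obj: CachedObject, now: float) -> str:
    """Return "deliver" to serve the cached copy, or "validate" to ask the server."""
    age = now - obj.fetched_at
    return "deliver" if age <= profile.max_staleness else "validate"
```

Under this assumption, a latency-sensitive client would set a large `max_staleness` and mostly receive cached copies, while a recency-sensitive client would set a small one and trigger validation more often.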