628 research outputs found

    Techniques of data prefetching, replication, and consistency in the Internet

    Get PDF
    Internet has become a major infrastructure for information sharing in our daily life, and indispensable to critical and large applications in industry, government, business, and education. Internet bandwidth (or the network speed to transfer data) has been dramatically increased, however, the latency time (or the delay to physically access data) has been reduced in a much slower pace. The rich bandwidth and lagging latency can be effectively coped with in Internet systems by three data management techniques: caching, replication, and prefetching. The focus of this dissertation is to address the latency problem in Internet by utilizing the rich bandwidth and large storage capacity for efficiently prefetching data to significantly improve the Web content caching performance, by proposing and implementing scalable data consistency maintenance methods to handle Internet Web address caching in distributed name systems (DNS), and to handle massive data replications in peer-to-peer systems. While the DNS service is critical in Internet, peer-to-peer data sharing is being accepted as an important activity in Internet.;We have made three contributions in developing prefetching techniques. First, we have proposed an efficient data structure for maintaining Web access information, called popularity-based Prediction by Partial Matching (PB-PPM), where data are placed and replaced guided by popularity information of Web accesses, thus only important and useful information is stored. PB-PPM greatly reduces the required storage space, and improves the prediction accuracy. Second, a major weakness in existing Web servers is that prefetching activities are scheduled independently of dynamically changing server workloads. Without a proper control and coordination between the two kinds of activities, prefetching can negatively affect the Web services and degrade the Web access performance. to address this problem, we have developed a queuing model to characterize the interactions. Guided by the model, we have designed a coordination scheme that dynamically adjusts the prefetching aggressiveness in Web Servers. This scheme not only prevents the Web servers from being overloaded, but it can also minimize the average server response time. Finally, we have proposed a scheme that effectively coordinates the sharing of access information for both proxy and Web servers. With the support of this scheme, the accuracy of prefetching decisions is significantly improved.;Regarding data consistency support for Internet caching and data replications, we have conducted three significant studies. First, we have developed a consistency support technique to maintain the data consistency among the replicas in structured P2P networks. Based on Pastry, an existing and popular P2P system, we have implemented this scheme, and show that it can effectively maintain consistency while prevent hot-spot and node-failure problems. Second, we have designed and implemented a DNS cache update protocol, called DNScup, to provide strong consistency for domain/IP mappings. Finally, we have developed a dynamic lease scheme to timely update the replicas in Internet

    Rough Sets Clustering and Markov model for Web Access Prediction

    Get PDF
    Discovering user access patterns from web access log is increasing the importance of information to build up adaptive web server according to the individual user’s behavior. The variety of user behaviors on accessing information also grows, which has a great impact on the network utilization. In this paper, we present a rough set clustering to cluster web transactions from web access logs and using Markov model for next access prediction. Using this approach, users can effectively mine web log records to discover and predict access patterns. We perform experiments using real web trace logs collected from www.dusit.ac.th servers. In order to improve its prediction ration, the model includes a rough sets scheme in which search similarity measure to compute the similarity between two sequences using upper approximation

    A taxonomy of web prediction algorithms

    Full text link
    Web prefetching techniques are an attractive solution to reduce the user-perceived latency. These techniques are driven by a prediction engine or algorithm that guesses following actions of web users. A large amount of prediction algorithms has been proposed since the first prefetching approach was published, although it is only over the last two or three years when they have begun to be successfully implemented in commercial products. These algorithms can be implemented in any element of the web architecture and can use a wide variety of information as input. This affects their structure, data system, computational resources and accuracy. The knowledge of the input information and the understanding of how it can be handled to make predictions can help to improve the design of current prediction engines, and consequently prefetching techniques. This paper analyzes fifty of the most relevant algorithms proposed along 15 years of prefetching research and proposes a taxonomy where the algorithms are classified according to the input data they use. For each group, the main advantages and shortcomings are highlighted. © 2012 Elsevier Ltd. All rights reserved.This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201, Generalitat Valenciana under Grant GV/2011/002 and Universitat Politecnica de Valencia under Grant PAID-06-10/2424.Domenech, J.; De La Ossa Perez, BA.; Sahuquillo Borrás, J.; Gil Salinas, JA.; Pont Sanjuan, A. (2012). A taxonomy of web prediction algorithms. Expert Systems with Applications. 39(9):8496-8502. https://doi.org/10.1016/j.eswa.2012.01.140S8496850239

    Efficient Proactive Caching for Supporting Seamless Mobility

    Full text link
    We present a distributed proactive caching approach that exploits user mobility information to decide where to proactively cache data to support seamless mobility, while efficiently utilizing cache storage using a congestion pricing scheme. The proposed approach is applicable to the case where objects have different sizes and to a two-level cache hierarchy, for both of which the proactive caching problem is hard. Additionally, our modeling framework considers the case where the delay is independent of the requested data object size and the case where the delay is a function of the object size. Our evaluation results show how various system parameters influence the delay gains of the proposed approach, which achieves robust and good performance relative to an oracle and an optimal scheme for a flat cache structure.Comment: 10 pages, 9 figure
    corecore