Web Page Prediction for Web Personalization: A Review
This paper presents a survey of Web Page Ranking for web personalization. Web page prefetching has been widely used to reduce the access latency problem of the Internet. However, if most prefetched web pages are not visited by the users in their subsequent accesses, the limited network bandwidth and server resources will not be used efficiently and may worsen the access delay problem. Therefore, it is critical to have an accurate prediction method during prefetching. Techniques such as Markov models have been widely used to represent and analyze users' navigational behavior (usage data) in the Web graph, using the transition probabilities between web pages, as recorded in the web logs. The recorded users' navigation is used to extract popular web paths and predict current users' next steps
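As an illustration of the prediction step described in this abstract, the following is a minimal sketch of a first-order Markov model estimated from logged sessions. It is not taken from the surveyed paper: the function names `fit_transitions` and `predict_next`, the sample sessions, and the 0.5 prefetch threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Estimate first-order transition probabilities from page sequences."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (current page, next page) pair.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return {
        page: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for page, followers in counts.items()
    }


def predict_next(transitions, page, threshold=0.5):
    """Return candidate next pages whose probability meets the threshold."""
    candidates = transitions.get(page, {})
    ranked = sorted(candidates.items(), key=lambda kv: -kv[1])
    return [nxt for nxt, prob in ranked if prob >= threshold]


# Hypothetical sessions, as they might be reconstructed from a web log.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/reviews"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
transitions = fit_transitions(sessions)
prefetch = predict_next(transitions, "/home")
```

Raising the threshold trades coverage for precision: fewer pages are prefetched, but a larger share of them is likely to be visited, which is the bandwidth concern the abstract raises.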
Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses
This paper presents the Referrer Graph (RG) web prediction algorithm and a pruning method for the associated
graph as a low-cost solution to predict users' next web accesses. RG is aimed at being used in a real
web system with prefetching capabilities without degrading its performance. The algorithm learns from
users' accesses and builds a Markov model. These kinds of algorithms use the sequence of the user accesses
to make predictions. Unlike previous Markov model based proposals, the RG algorithm differentiates
dependencies in objects of the same page from objects of different pages by using the object URI and the
referrer in each request. Although its design permits us to build a simple data structure that is easier to
handle and, consequently, needs lower computational cost in comparison with other algorithms, a pruning
mechanism has been devised to avoid the continuous growth of this data structure. Results show
that, compared with the best prediction algorithms proposed in the open literature, the RG algorithm
achieves similar precision values and page latency savings while requiring much less computational and
memory resources. Furthermore, when pruning is applied, additional and notable resource consumption
savings can be achieved without degrading original performance. In order to further reduce the resource
consumption, a mechanism to prune the graph has been devised, which reduces resource consumption of
the baseline system without degrading the latency savings.
2013 Elsevier B.V. All rights reserved.
This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201. The authors would also like to thank the technical staff of the School of Computer Science at the Polytechnic University of Valencia for providing us recent and customized trace files logged by their web server.
De La Ossa Perez, BA.; Gil Salinas, JA.; Sahuquillo Borrás, J.; Pont Sanjuan, A. (2013). Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses. Computer Communications. 36(8):881-894. https://doi.org/10.1016/j.comcom.2013.02.005
Generating dynamic higher-order Markov models in web usage mining
Markov models have been widely used for modelling users’ web navigation behaviour. In previous work we have presented a dynamic clustering-based Markov model that accurately represents second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes into account higher-order conditional probabilities. The method makes use of the state cloning concept together with a clustering technique to separate the navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real world data sets. The results show that some pages require a long history to understand the users’ choice of link, while others require only a short history. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter
Retrospective Higher-Order Markov Processes for User Trails
Users form information trails as they browse the web, check in with a
geolocation, rate items, or consume media. A common problem is to predict what
a user might do next for the purposes of guidance, recommendation, or
prefetching. First-order and higher-order Markov chains have been widely used
methods to study such sequences of data. First-order Markov chains are easy to
estimate, but lack accuracy when history matters. Higher-order Markov chains,
in contrast, have too many parameters and suffer from overfitting the training
data. Fitting these parameters with regularization and smoothing only offers
mild improvements. In this paper we propose the retrospective higher-order
Markov process (RHOMP) as a low-parameter model for such sequences. This model
is a special case of a higher-order Markov chain where the transitions depend
retrospectively on a single history state instead of an arbitrary combination
of history states. There are two immediate computational advantages: the number
of parameters is linear in the order of the Markov chain and the model can be
fit to large state spaces. Furthermore, by providing a specific structure to
the higher-order chain, RHOMPs improve the model accuracy by efficiently
utilizing history states without risks of overfitting the data. We demonstrate
how to estimate a RHOMP from data and we demonstrate the effectiveness of our
method on various real application datasets spanning geolocation data, review
sequences, and business locations. The RHOMP model uniformly outperforms
higher-order Markov chains, Kneser-Ney regularization, and tensor
factorizations in terms of prediction accuracy
Web Site Personalization based on Link Analysis and Navigational Patterns
The continuous growth in the size and use of the World Wide Web imposes new methods of design and development of on-line information services. The need for predicting the users’ needs in order to improve the usability and user retention of a web site is more than evident and can be addressed by personalizing it. Recommendation algorithms aim at proposing “next” pages to users based on their current visit and the past users’ navigational patterns. In the vast majority of related algorithms, however, only the usage data are used to produce recommendations, disregarding the structural properties of the web graph. Thus important – in terms of PageRank authority score – pages may be underrated. In this work we present UPR, a PageRank-style algorithm which combines usage data and link analysis techniques for assigning probabilities to the web pages based on their importance in the web site’s navigational graph. We propose the application of a localized version of UPR (l-UPR) to personalized navigational sub-graphs for online web page ranking and recommendation. Moreover, we propose a hybrid probabilistic predictive model based on Markov models and link analysis for assigning prior probabilities in a hybrid probabilistic model. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced from the pure usage-based approaches
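The idea of combining usage data with link analysis can be sketched as a PageRank-style power iteration in which each outgoing link is weighted by how often users actually followed it. This is only a simplified illustration, not the UPR formulation from the paper: the function name `usage_weighted_pagerank`, the damping factor of 0.85, the uniform teleport vector, and the traversal counts are all assumptions.

```python
def usage_weighted_pagerank(edge_counts, damping=0.85, iterations=50):
    """PageRank-style scores with links weighted by observed traversal counts."""
    pages = set(edge_counts)
    for targets in edge_counts.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the same teleport share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for src in pages:
            targets = edge_counts.get(src, {})
            total = sum(targets.values())
            if total == 0:
                # Dangling page: spread its score uniformly over all pages.
                for page in pages:
                    new_rank[page] += damping * rank[src] / n
            else:
                # Split the score in proportion to how often each link was used.
                for dst, count in targets.items():
                    new_rank[dst] += damping * rank[src] * count / total
        rank = new_rank
    return rank


# Hypothetical traversal counts: edge_counts[source][target] = times followed.
edge_counts = {
    "/home": {"/products": 8, "/about": 2},
    "/products": {"/home": 3, "/cart": 5},
    "/about": {"/home": 1},
}
scores = usage_weighted_pagerank(edge_counts)
```

With these counts, `/products` scores higher than `/about` even though both are linked only from `/home`, because the link to `/products` is followed four times as often.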
IMPUTING OR SMOOTHING? MODELLING THE MISSING ONLINE CUSTOMER JOURNEY TRANSITIONS FOR PURCHASE PREDICTION
Online customer journeys are at the core of e-commerce systems and it is therefore important to model and understand this online customer behaviour. Clickstream data from online journeys can be modelled using Markov Chains. This study investigates two different approaches to handle missing transition probabilities in constructing Markov Chain models for purchase prediction. Imputing the transition probabilities by using the Chapman-Kolmogorov (CK) equation addresses this issue and achieves high prediction accuracy by approximating them with one-step-ahead probabilities. However, it comes with the problem of a high computational burden and some probabilities remaining zero after imputation. An alternative approach is to smooth the transition probabilities using Bayesian techniques. This ensures non-zero probabilities but this approach has been criticized for not being as accurate as the CK method, though this has not been fully evaluated in the literature using realistic, commercial data. We compare the accuracy of the purchase prediction of the CK and Bayesian methods, and evaluate them based on commercial web server data from a major European airline
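The contrast between the two approaches can be illustrated on a toy three-state chain. The sketch below is not the study's procedure. It assumes that imputation fills each unobserved one-step transition with the two-step probability given by the Chapman-Kolmogorov equation and then renormalizes the row, and that the Bayesian alternative is the posterior mean under a symmetric Dirichlet prior with pseudo-count `alpha`. The state names and counts are hypothetical.

```python
def normalize(row):
    """Scale a row of non-negative numbers so it sums to 1."""
    total = sum(row)
    return [x / total for x in row] if total else row[:]


def two_step(P):
    """Chapman-Kolmogorov: two-step probabilities are the matrix product P * P."""
    n = len(P)
    return [[sum(P[i][k] * P[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def impute_with_two_step(P):
    """Fill zero one-step entries with two-step values, then renormalize rows."""
    n = len(P)
    P2 = two_step(P)
    filled = [[P[i][j] if P[i][j] > 0 else P2[i][j] for j in range(n)]
              for i in range(n)]
    return [normalize(row) for row in filled]


def dirichlet_smooth(counts, alpha=1.0):
    """Posterior-mean transition probabilities under a symmetric Dirichlet prior."""
    n = len(counts)
    return [[(c + alpha) / (sum(row) + alpha * n) for c in row] for row in counts]


# Hypothetical transition counts; states are 0=browse, 1=search, 2=purchase.
counts = [
    [0, 4, 0],
    [1, 0, 3],
    [0, 0, 2],
]
P = [normalize(row) for row in counts]
imputed = impute_with_two_step(P)
smoothed = dirichlet_smooth(counts)
```

In this example the imputed matrix gains a non-zero browse-to-purchase entry, but the purchase row keeps its zeros because no two-step path leaves that state. The smoothed matrix has no zero entries at all, which matches the trade-off the abstract describes.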
Predictive Analytics with Sequence-based Clustering and Markov Chain
This research proposes a predictive modeling framework for Web user behavior with Web usage mining (WUM). The proposed predictive model utilizes sequence-based clustering, in order to group Web users into clusters with similar Web browsing behavior and Markov chains, in order to model Web users’ Web navigation behavior. This research will also provide a performance evaluation framework and suggest WUM systems that can improve advertisement placement and target marketing in a Web site
REVIEW PAPER ON WEB PAGE PREDICTION USING DATA MINING
The continuous growth of the World Wide Web calls for new methods for the design and development of on-line information services, and for web usage mining that preprocesses web page data to determine how pages are accessed. The need to predict users’ needs in order to improve the usability and user retention of a web site is more than evident nowadays. Without proper guidance, a visitor often wanders aimlessly without visiting important pages, loses interest, and leaves the site sooner than expected. The proposed system focuses on investigating efficient and effective sequential access pattern mining techniques for web usage data. The mined patterns are then used for matching and generating web links for online recommendations. A web-page-of-interest application will be developed for evaluating the quality and effectiveness of the discovered knowledge.  Keywords: Webpage Prediction, Web Mining, MRF, ANN, KNN, GA