11,974 research outputs found
Web Site Personalization based on Link Analysis and Navigational Patterns
The continuous growth in the size and use of the World Wide Web imposes new methods of design and development of on-line information services. The need for predicting the users’ needs in order to improve the usability and user retention of a web site is more than evident and can be addressed by personalizing it. Recommendation algorithms aim at proposing “next” pages to users based on their current visit and the past users’ navigational patterns. In the vast majority of related algorithms, however, only the usage data are used to produce recommendations, disregarding the structural properties of the web graph. Thus important – in terms of PageRank authority score – pages may be underrated. In this work we present UPR, a PageRank-style algorithm which combines usage data and link analysis techniques for assigning probabilities to the web pages based on their importance in the web site’s navigational graph. We propose the application of a localized version of UPR (l-UPR) to personalized navigational sub-graphs for online web page ranking and recommendation. Moreover, we propose a hybrid probabilistic predictive model based on Markov models and link analysis for assigning prior probabilities in a hybrid probabilistic model. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced from the pure usage-based approaches
Evaluating Variable Length Markov Chain Models for Analysis of User Web Navigation Sessions
Markov models have been widely used to represent and analyse user web
navigation data. In previous work we have proposed a method to dynamically
extend the order of a Markov chain model and a complimentary method for
assessing the predictive power of such a variable length Markov chain. Herein,
we review these two methods and propose a novel method for measuring the
ability of a variable length Markov model to summarise user web navigation
sessions up to a given length. While the summarisation ability of a model is
important to enable the identification of user navigation patterns, the ability
to make predictions is important in order to foresee the next link choice of a
user after following a given trail so as, for example, to personalise a web
site. We present an extensive experimental evaluation providing strong evidence
that prediction accuracy increases linearly with summarisation ability
Rough Sets Clustering and Markov model for Web Access Prediction
Discovering user access patterns from web access log is increasing the importance of information to build up adaptive web server according to the individual user’s behavior. The variety of user behaviors on accessing information also grows, which has a great impact on the network utilization. In this paper, we present a rough set clustering to cluster web transactions from web access logs and using Markov model for next access prediction. Using this approach, users can effectively mine web log records to discover and predict access patterns. We perform experiments using real web trace logs collected from www.dusit.ac.th servers. In order to improve its prediction ration, the model includes a rough sets scheme in which search similarity measure to compute the similarity between two sequences using upper approximation
When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks
We introduce a framework for the modeling of sequential data capturing
pathways of varying lengths observed in a network. Such data are important,
e.g., when studying click streams in information networks, travel patterns in
transportation systems, information cascades in social networks, biological
pathways or time-stamped social interactions. While it is common to apply graph
analytics and network analysis to such data, recent works have shown that
temporal correlations can invalidate the results of such methods. This raises a
fundamental question: when is a network abstraction of sequential data
justified? Addressing this open question, we propose a framework which combines
Markov chains of multiple, higher orders into a multi-layer graphical model
that captures temporal correlations in pathways at multiple length scales
simultaneously. We develop a model selection technique to infer the optimal
number of layers of such a model and show that it outperforms previously used
Markov order detection techniques. An application to eight real-world data sets
on pathways and temporal networks shows that it allows to infer graphical
models which capture both topological and temporal characteristics of such
data. Our work highlights fallacies of network abstractions and provides a
principled answer to the open question when they are justified. Generalizing
network representations to multi-order graphical models, it opens perspectives
for new data mining and knowledge discovery algorithms.Comment: 10 pages, 4 figures, 1 table, companion python package pathpy
available on gitHu
Retrospective Higher-Order Markov Processes for User Trails
Users form information trails as they browse the web, checkin with a
geolocation, rate items, or consume media. A common problem is to predict what
a user might do next for the purposes of guidance, recommendation, or
prefetching. First-order and higher-order Markov chains have been widely used
methods to study such sequences of data. First-order Markov chains are easy to
estimate, but lack accuracy when history matters. Higher-order Markov chains,
in contrast, have too many parameters and suffer from overfitting the training
data. Fitting these parameters with regularization and smoothing only offers
mild improvements. In this paper we propose the retrospective higher-order
Markov process (RHOMP) as a low-parameter model for such sequences. This model
is a special case of a higher-order Markov chain where the transitions depend
retrospectively on a single history state instead of an arbitrary combination
of history states. There are two immediate computational advantages: the number
of parameters is linear in the order of the Markov chain and the model can be
fit to large state spaces. Furthermore, by providing a specific structure to
the higher-order chain, RHOMPs improve the model accuracy by efficiently
utilizing history states without risks of overfitting the data. We demonstrate
how to estimate a RHOMP from data and we demonstrate the effectiveness of our
method on various real application datasets spanning geolocation data, review
sequences, and business locations. The RHOMP model uniformly outperforms
higher-order Markov chains, Kneser-Ney regularization, and tensor
factorizations in terms of prediction accuracy
- …