Generating dynamic higher-order Markov models in web usage mining
Markov models have been widely used for modelling users' web navigation behaviour. In previous work we presented a dynamic clustering-based Markov model that accurately represents second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes higher-order conditional probabilities into account. The method uses the state-cloning concept together with a clustering technique to separate navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real-world data sets. The results show that some pages require a long history to understand the users' choice of link, while others require only a short one. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter.
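The core estimation step the abstract builds on can be sketched as follows: counting k-th-order transitions over a collection of navigation sessions to obtain the conditional probabilities P(next page | last k pages). This is a minimal illustration of higher-order transition estimation, not the paper's cloning-and-clustering algorithm itself; the function and variable names are ours.

```python
from collections import defaultdict

def kth_order_transitions(sessions, k):
    """Estimate P(next_page | last k pages) from navigation sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for i in range(len(session) - k):
            history = tuple(session[i:i + k])  # the k-page context
            counts[history][session[i + k]] += 1
    probs = {}
    for history, nexts in counts.items():
        total = sum(nexts.values())
        probs[history] = {page: n / total for page, n in nexts.items()}
    return probs

sessions = [["A", "B", "C"], ["A", "B", "D"], ["E", "B", "C"]]
p2 = kth_order_transitions(sessions, 2)
# With k=2, the context ("A", "B") splits evenly between C and D,
# while ("E", "B") always leads to C - the kind of history-dependent
# difference that motivates cloning the shared state B.
```

The paper's method decides, per page, whether such history-dependent differences are large enough to warrant cloning the state.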
Data Mining for Browsing Patterns in Weblog Data by ART Neural Networks
Categorising visitors based on their interaction with a website is a key problem in Web content usage. The clickstreams generated by various users often follow distinct patterns, the knowledge of which may help in providing customised content. This paper proposes an approach to clustering weblog data based on ART2 neural networks. Due to the characteristics of the ART2 neural network model, the proposed approach can be used for unsupervised and self-learning data mining, which makes it adaptable to dynamically changing websites.
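The self-learning property the abstract refers to comes from adaptive resonance: an input either resonates with an existing cluster prototype (above a vigilance threshold) or spawns a new cluster, so the network grows as the website changes. Below is a deliberately simplified ART-style sketch under that idea; full ART2 adds normalisation layers and noise suppression that are omitted here, and all names and parameter values are illustrative.

```python
import numpy as np

def art_like_cluster(vectors, vigilance=0.9, lr=0.5):
    """Simplified ART-style clustering: assign each normalised vector to
    the best-matching prototype if the cosine match exceeds the vigilance
    threshold, otherwise create a new cluster."""
    prototypes, labels = [], []
    for v in vectors:
        v = v / np.linalg.norm(v)
        best, best_sim = None, -1.0
        for j, p in enumerate(prototypes):
            sim = float(v @ p)          # cosine similarity (unit vectors)
            if sim > best_sim:
                best, best_sim = j, sim
        if best is not None and best_sim >= vigilance:
            # resonance: nudge the winning prototype toward the input
            p = prototypes[best] + lr * (v - prototypes[best])
            prototypes[best] = p / np.linalg.norm(p)
            labels.append(best)
        else:
            # no resonance: a new cluster is created on the fly
            prototypes.append(v)
            labels.append(len(prototypes) - 1)
    return labels, prototypes
```

Because clusters are created on demand, the number of browsing-pattern groups need not be fixed in advance, which is what makes this family of models suited to dynamically changing websites.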
Implicit Measures of Lostness and Success in Web Navigation
In two studies, we investigated the ability of a variety of structural and temporal measures computed from a web navigation path to predict lostness and task success. The user's task was to find requested target information on specified websites. The web navigation measures were based on counts of visits to web pages and other statistical properties of the web usage graph (such as compactness, stratum, and similarity to the optimal path). Subjective lostness was best predicted by similarity to the optimal path and time on task. The best overall predictor of success on individual tasks was similarity to the optimal path, but other predictors were sometimes superior depending on the particular web navigation task. These measures can be used to diagnose user navigational problems and to help identify problems in website design.
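The "similarity to the optimal path" measure can be instantiated in several ways; one plausible sketch (not necessarily the exact formulation used in the studies) normalises the edit distance between the user's page sequence and the optimal sequence into a [0, 1] similarity score:

```python
def path_similarity(user_path, optimal_path):
    """Similarity between a navigation path and the optimal path,
    via Levenshtein edit distance normalised to [0, 1].
    1.0 means the user followed the optimal path exactly."""
    m, n = len(user_path), len(optimal_path)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if user_path[i - 1] == optimal_path[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return 1 - d[m][n] / max(m, n)
```

A detour such as visiting one extra page on a four-step path yields a similarity of 0.75, which a diagnostic tool could flag against a lostness threshold.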
Log Pre-Processing and Grammatical Inference for Web Usage Mining
In this paper, we propose a Web Usage Mining pre-processing method to retrieve missing data from web server log files. Moreover, we propose two levels of evaluation: directly on the reconstructed data, and also after a machine learning step, by evaluating the inferred grammatical models. Our experiments show that the algorithm improves the quality of user data.
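A typical source of the missing data being reconstructed is browser caching: pages revisited via the "Back" button never reach the server log. A common pre-processing heuristic, sketched below, re-inserts those pages using the HTTP referrer field. This illustrates the general idea only; the paper's reconstruction algorithm and its grammatical-inference evaluation are more involved.

```python
def complete_path(pages, referrers):
    """Heuristic path completion: if a request's referrer appears earlier
    in the session but is not the previous page, the user likely backed
    through cached pages; re-insert the backtracked pages."""
    completed = []
    for page, ref in zip(pages, referrers):
        if completed and ref is not None and ref != completed[-1] and ref in completed:
            # walk back to the most recent occurrence of the referrer
            i = len(completed) - 1
            while completed[i] != ref:
                i -= 1
            # re-add the pages traversed backwards (served from cache)
            completed.extend(reversed(completed[i:-1]))
        completed.append(page)
    return completed

# A request for D with referrer A, after the logged path A -> B -> C,
# implies the unlogged back-steps C -> B -> A before the click on D.
print(complete_path(["A", "B", "C", "D"], [None, "A", "B", "A"]))
```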
Imputing or smoothing? Modelling the missing online customer journey transitions for purchase prediction
Online customer journeys are at the core of e-commerce systems and it is therefore important to model and understand this online customer behaviour. Clickstream data from online journeys can be modelled using Markov Chains. This study investigates two different approaches to handling missing transition probabilities when constructing Markov Chain models for purchase prediction. Imputing the transition probabilities using the Chapman-Kolmogorov (CK) equation addresses this issue and achieves high prediction accuracy by approximating them with the one-step-ahead probability. However, it comes with a high computational burden, and some probabilities remain zero after imputation. An alternative approach is to smooth the transition probabilities using Bayesian techniques. This ensures non-zero probabilities, but the approach has been criticised for not being as accurate as the CK method, though this has not been fully evaluated in the literature using realistic, commercial data. We compare the purchase-prediction accuracy of the CK and Bayesian methods, and evaluate them based on commercial web server data from a major European airline.
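The two approaches being compared can be sketched on a transition matrix: CK-style imputation fills a zero one-step probability with the corresponding two-step probability (a row of P squared), while Bayesian smoothing in its simplest Laplace/Dirichlet form adds a pseudo-count so nothing is exactly zero. These are minimal illustrations of the two ideas, not the study's exact estimators.

```python
import numpy as np

def ck_impute(P):
    """Impute zero one-step transition probabilities with the two-step
    (Chapman-Kolmogorov) probability P @ P, then renormalise each row.
    Entries that are zero in both P and P @ P remain zero - the residual
    problem the abstract mentions."""
    P2 = P @ P
    filled = np.where(P == 0, P2, P)
    return filled / filled.sum(axis=1, keepdims=True)

def laplace_smooth(counts, alpha=1.0):
    """Bayesian (Laplace/Dirichlet) smoothing: add pseudo-count alpha to
    every observed transition count, guaranteeing non-zero probabilities."""
    c = counts + alpha
    return c / c.sum(axis=1, keepdims=True)
```

Note the trade-off visible even at this scale: `ck_impute` requires a matrix multiplication over the full state space (costly for large websites), whereas `laplace_smooth` is a cheap element-wise operation but flattens the distribution.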
An average linear time algorithm for web data mining
In this paper, we study the complexity of a data mining algorithm, proposed in previous work [3], for extracting patterns from user web navigation data. The user web navigation sessions are inferred from log data and modelled as a Markov chain. The chain's higher-probability trails correspond to the preferred trails on the web site. The algorithm implements a depth-first search that scans the Markov chain for the high-probability trails. We show that the algorithm's average running time is linear in the number of web pages accessed.
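The depth-first scan can be sketched as follows: extend a trail while its cumulative probability stays above a cut-point, and emit it once no extension qualifies. This is an illustration of the idea rather than the analysed algorithm itself (which includes further details), and it assumes the chain has no probability-1 cycles so the search terminates.

```python
def high_probability_trails(transitions, initial, threshold):
    """Depth-first search for trails whose cumulative probability
    stays at or above `threshold`.
    transitions: {state: {next_state: probability}}
    initial:     {state: initial probability}"""
    trails = []

    def dfs(state, trail, prob):
        extended = False
        for nxt, p in transitions.get(state, {}).items():
            if prob * p >= threshold:   # prune low-probability branches
                dfs(nxt, trail + [nxt], prob * p)
                extended = True
        if not extended:                # maximal trail: record it
            trails.append((trail, prob))

    for start, p0 in initial.items():
        if p0 >= threshold:
            dfs(start, [start], p0)
    return trails

transitions = {"A": {"B": 0.8, "C": 0.2}, "B": {"C": 1.0}}
print(high_probability_trails(transitions, {"A": 1.0}, 0.5))
```

Because cumulative probabilities shrink with every transition, the pruning bounds trail length, which is what makes the average-case linear behaviour plausible.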
Cost and Response Time Simulation for Web-based Applications on Mobile Channels
When considering the addition of a mobile presentation channel to an existing web-based application, a key question that has to be answered even before development begins is how the mobile channel's characteristics will impact the user experience and the cost of using the application. If either of these factors is outside acceptable limits, economic considerations may forbid adding the channel, even if it would be feasible from a purely technical perspective. Both of these factors depend considerably on two metrics: the time required to transmit data over the mobile network, and the volume transmitted.
The PETTICOAT method presented in this paper uses the dialog flow model and web server log files of an existing application to identify typical interaction sequences and to compile volume statistics, which are then run through a tool that simulates the volume and time that would be incurred by executing the interaction sequences on a mobile channel. From the simulated volume and time data, we can then calculate the cost of accessing the application on a mobile channel.
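The final simulation step reduces to a simple model: each request in an interaction sequence pays a round-trip latency plus a transfer time proportional to its volume, and cost is proportional to the total volume. The sketch below is a back-of-the-envelope illustration of that calculation; all parameter values are placeholders, not PETTICOAT's.

```python
def simulate_mobile_session(page_volumes_kb, latency_s=0.5,
                            bandwidth_kbps=64, price_per_kb=0.01):
    """Estimate response time and cost of one interaction sequence on a
    mobile channel. Each request pays one round-trip latency; transfer
    time is volume / bandwidth; cost is volume-based (illustrative
    tariff, as was common for GPRS-era mobile data)."""
    volume_kb = sum(page_volumes_kb)
    time_s = (len(page_volumes_kb) * latency_s          # per-request latency
              + volume_kb * 8 / bandwidth_kbps)         # transfer time
    cost = volume_kb * price_per_kb
    return time_s, cost

# Three pages of 10, 20 and 30 KB over a 64 kbit/s channel:
print(simulate_mobile_session([10, 20, 30]))
```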
Automatic tracking and control for web recommendation: new approaches for web recommendation
Recommender systems provide users with pertinent resources according to their context and their profiles, by applying statistical and knowledge discovery techniques. This paper describes a new approach to generating suitable recommendations based on the active user's navigation stream, taking into account long-distance resources in the history. Our main idea for solving this problem is the following: users browsing web pages or web contents can be seen as objects moving along trajectories in the web space. Under this assumption, we derive an appropriate description of the so-called recommender space and propose a mathematical model describing the behaviour of the users/targets along the trajectories inside the recommender space. The second main assumption can then be expressed as follows: if we are able to track the users/targets along their trajectories, we are able to predict their future positions in the sub-spaces of the recommender space, i.e., we can derive a new method for web recommendation and behaviour monitoring. To achieve these objectives, we use the theory of dynamic state estimation and, more specifically, the theory of Kalman filtering. We establish the appropriate model of the target tracker and derive the iterative formulation of the filter. Then, we propose a new recommender system formulated as a control loop. We validate our approach on data extracted from online video consumption and derive a user-monitoring approach. Conclusions and perspectives are drawn from the analysis of the obtained results and focus on the formulation of a topology of the recommender space.
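The estimation machinery the approach builds on is the standard linear Kalman predict/update cycle, sketched below. The recommender-space model (state definition, trajectories, control loop) is the paper's contribution and is not reproduced here; the matrices in the usage example encode a generic constant-velocity target, purely for illustration.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter.
    x, P: prior state estimate and covariance
    z:    new observation
    F, H: state-transition and observation matrices
    Q, R: process and observation noise covariances"""
    # predict: propagate the state along its trajectory model
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # update: correct the prediction with the new observation
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Constant-velocity target: state = [position, velocity], observe position.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
x, P = np.array([0.0, 1.0]), np.eye(2)
x, P = kalman_step(x, P, np.array([1.2]), F, H, 0.01 * np.eye(2), np.array([[0.01]]))
```

In the paper's setting, the predicted position in the recommender space is what drives the recommendation, closing the control loop the abstract describes.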