    Applications of concurrent access patterns in web usage mining

    This paper builds on the original data mining and modelling research which proposed the discovery of novel structural relation patterns, and applies the approach to web usage mining. The focus of attention here is on concurrent access patterns (CAP), where an overarching framework illuminates the methodology for post-processing web access patterns. Data pre-processing, pattern discovery and pattern analysis all proceed in association with access pattern mining, CAP mining and CAP modelling. Pruning and selection of access patterns take place as necessary, allowing further CAP mining and modelling to be pursued in the search for the most interesting concurrent access patterns. It is shown that higher level CAPs can be modelled in a way which brings greater structure to bear on the process of knowledge discovery. Experiments with real-world datasets highlight the applicability of the approach in web navigation.

    Evaluating Variable Length Markov Chain Models for Analysis of User Web Navigation Sessions

    Markov models have been widely used to represent and analyse user web navigation data. In previous work we have proposed a method to dynamically extend the order of a Markov chain model and a complementary method for assessing the predictive power of such a variable length Markov chain. Herein, we review these two methods and propose a novel method for measuring the ability of a variable length Markov model to summarise user web navigation sessions up to a given length. While the summarisation ability of a model is important to enable the identification of user navigation patterns, the ability to make predictions is important in order to foresee the next link choice of a user after following a given trail so as, for example, to personalise a web site. We present an extensive experimental evaluation providing strong evidence that prediction accuracy increases linearly with summarisation ability.
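
To make the idea of predicting the next link from a trail more concrete, the following is a minimal sketch of a back-off predictor over navigation sessions. It is not the authors' method for dynamically extending the order of the chain; it simply counts next-page frequencies for contexts of length 1 up to a hypothetical `max_order` and predicts from the longest context seen in training. The function names and the session format (lists of page identifiers) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(sessions, max_order=2):
    """Count next-page frequencies for every context of length 1..max_order."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(1, len(session)):
            for k in range(1, max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(session[i - k:i])
                counts[context][session[i]] += 1
    return counts


def predict_next(counts, trail, max_order=2):
    """Predict the next page using the longest suffix of the trail seen in training."""
    for k in range(min(max_order, len(trail)), 0, -1):
        context = tuple(trail[-k:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None  # no suffix of the trail was observed


sessions = [["A", "B", "C"], ["A", "B", "C"], ["D", "B", "E"]]
model = train(sessions)
second_order_guess = predict_next(model, ["D", "B"])   # uses the context (D, B)
first_order_guess = predict_next(model, ["X", "B"])    # backs off to the context (B,)
```

In this toy data the longer context changes the prediction after page B, which is the kind of situation a higher-order model is meant to capture.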

    A fine grained heuristic to capture web navigation patterns

    In previous work we have proposed a statistical model to capture user behaviour when browsing the web. The user navigation information obtained from web logs is modelled as a hypertext probabilistic grammar (HPG), which is within the class of regular probabilistic grammars. The set of highest probability strings generated by the grammar corresponds to the users' preferred navigation trails. We have previously conducted experiments with a Breadth-First Search (BFS) algorithm to perform the exhaustive computation of all the strings with probability above a specified cut-point, which we call the rules. Although the algorithm's running time varies linearly with the number of grammar states, it has the drawbacks of returning a large number of rules when the cut-point is small and a small set of very short rules when the cut-point is high. In this work, we present a new heuristic that implements an iterative deepening search wherein the set of rules is incrementally augmented by first exploring trails with high probability. A stopping parameter is provided which measures the distance between the current rule-set and its corresponding maximal set obtained by the BFS algorithm. When the stopping parameter takes the value zero the heuristic corresponds to the BFS algorithm, and as the parameter takes values closer to one the number of rules obtained decreases accordingly. Experiments were conducted with both real and synthetic data, and the results show that for a given cut-point the number of rules induced increases smoothly as the stopping criterion decreases. Therefore, by setting the value of the stopping criterion the analyst can determine the number and quality of rules to be induced; the quality of a rule is measured by both its length and probability.
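
The role of the cut-point in the BFS computation can be sketched as follows. This is an illustrative simplification, not the authors' HPG implementation: the grammar is assumed to be given as a dictionary of start probabilities (`start`) and a dictionary of transition probabilities (`trans`), a trail is extended while its probability stays at or above the cut-point, and a trail is emitted as a rule once no extension qualifies. The `max_len` guard is an added assumption to keep the sketch safe on cyclic grammars.

```python
from collections import deque


def bfs_rules(start, trans, cut_point, max_len=20):
    """Enumerate maximal trails whose probability is at least cut_point."""
    rules = []
    queue = deque(((page,), p) for page, p in start.items() if p >= cut_point)
    while queue:
        trail, prob = queue.popleft()
        # Candidate extensions that still meet the cut-point.
        extensions = [
            (trail + (nxt,), prob * p)
            for nxt, p in trans.get(trail[-1], {}).items()
            if prob * p >= cut_point
        ]
        if extensions and len(trail) < max_len:
            queue.extend(extensions)
        else:
            rules.append((trail, prob))  # cannot be extended: emit as a rule
    return rules


# Hypothetical toy grammar.
start = {"A": 0.5, "B": 0.5}
trans = {"A": {"B": 0.8, "C": 0.2}, "B": {"C": 0.5}}

low_cut = bfs_rules(start, trans, cut_point=0.15)
high_cut = bfs_rules(start, trans, cut_point=0.3)
```

On this toy grammar the higher cut-point yields shorter rules, which mirrors the drawback described in the abstract.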

    Prediction of users’ future requests using neural network

    With the rapid growth of the World Wide Web, finding useful information on the Internet has become a critical issue. Automatic classification of user navigation patterns provides a useful tool for addressing this problem. In this paper, we propose an approach for classifying users' navigation patterns and predicting users' future requests. Users' profiles are constructed from Web server log files, and a clustering method is applied to the profiles to assign navigation patterns. Finally, using a neural network, the recommender engine produces a relevant recommendation list of web pages for the active user. The preliminary results indicate that the proposed approach has high accuracy and coverage in predicting users' future requests.

    A Web-Based Recommendation System To Predict User Movements Through Web Usage Mining

    Web usage mining has become the subject of extensive research, as its potential for Web-based personalized services, prediction of users' near-future intentions, adaptive Web sites and customer profiling is recognized. Recently, a variety of recommendation systems that predict users' future movements through web usage mining have been proposed. However, the quality of the recommendations produced by current systems still does not satisfy users of particular web sites. Prediction accuracy is a main factor by which the quality of a recommendation system is measured. The latest contribution in this area achieves an accuracy of about 50% for its recommendations. To provide online prediction effectively, this study has developed a Web-based recommendation system to Predict User Movements, named WebPUM, for online prediction through web usage mining, and has proposed a novel approach for classifying user navigation patterns to predict users' future intentions. WebPUM has two main phases: an offline phase and an online phase. The approach in the offline phase is based on a new graph partitioning algorithm that models user navigation patterns for navigation pattern mining. In this phase, an undirected graph is created with the Web pages as vertices and the degree of connectivity between Web pages as edge weights, using a newly proposed formula for the weight of each edge. Navigation pattern mining is then performed by finding the connected components of the graph. In the online phase, the longest common subsequence algorithm is used as a new approach in the recommendation system for classifying current user activities to predict the user's next movements. The longest common subsequence is a well-known string matching algorithm that we have utilized to find the most similar pattern between a set of navigation patterns and the current user activities in order to create the recommendations.
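
The online phase described above can be illustrated with a short sketch of longest common subsequence (LCS) matching. The LCS computation is the standard dynamic programme; everything around it is an assumption made for illustration, in particular that navigation patterns are stored as ordered page lists and that the recommendation consists of the pages of the best-matching pattern that the user has not yet visited. The abstract does not specify these details, so `recommend` should not be read as WebPUM's actual procedure.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def recommend(patterns, active_session):
    """Pick the pattern with the largest LCS and suggest its unvisited pages."""
    best = max(patterns, key=lambda p: lcs_length(p, active_session))
    seen = set(active_session)
    return [page for page in best if page not in seen]


# Hypothetical navigation patterns and active session.
patterns = [
    ["home", "products", "cart", "checkout"],
    ["home", "blog", "about"],
]
suggestions = recommend(patterns, ["home", "products"])
```

Because LCS tolerates gaps, a session that skips or interleaves pages can still be matched to a pattern, which is presumably why it suits noisy click streams.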

    Analysis of web visit histories, part I: Distance-based visualization of sequence rules

    This paper constitutes Part I of a contribution to the analysis of web visit histories through a new methodological framework. Firstly, web usage and web structure mining are considered as a single mining process to detect the latent structure of the navigation across the web sections of a single portal. We extend association rules theory to web data, defining new concepts of web (pattern) association and preference matrices, as well as of (indirect and direct) sequence rules. We identify the most significant rules according to a multiple testing procedure. In the literature, web usage patterns can be visualized in non-distance-based graphs describing the navigation behavior across web pages with sequential arrows. Here, we introduce a geometrical visualization of sequence rules at any click of the web navigation. In particular, we provide two distance-based visualization methods: one for the static analysis of all data tout court, and one for the dynamic analysis that discovers the most significant web paths click by click. A real-world case study is considered throughout the methodological description.

    WebPUM: a web-based recommendation system to predict user future movements

    Web usage mining has become the subject of extensive research, as its potential for Web-based personalized services, prediction of users' near-future intentions, adaptive Web sites, and customer profiling is recognized. Recently, a variety of recommendation systems that predict users' future movements through Web usage mining have been proposed. However, the quality of the recommendations that current systems produce for a particular Web site is unsatisfactory. To provide online prediction effectively, we have developed a recommendation system called WebPUM, which performs online prediction using Web usage mining, and we propose a novel approach for classifying user navigation patterns to predict users' future intentions. The approach is based on a new graph partitioning algorithm that models user navigation patterns in the navigation pattern mining phase. Furthermore, the longest common subsequence algorithm is used to classify current user activities in order to predict the user's next movement. The proposed system has been tested on the CTI and MSNBC datasets. The results show an improvement in the quality of recommendations. Furthermore, experiments on scalability show that the size of the dataset and the number of users in the dataset do not significantly affect the accuracy.

    Analysis and Implementation of Web Usage Mining Using a Graph Partitioning Algorithm (Case Study: Tuneeca Online Store)

    Increased visits to a website generate a large amount of data about users and their interaction with the website, which is stored in the web server logs. One kind of information that can be obtained from these logs is user navigation patterns. User navigation patterns give an overview of what users actually do and need when they access the website. Understanding them is useful for understanding user behavior in accessing the website, so they can be used as a reference for improving the quality of the website and ensuring user satisfaction. In the e-commerce domain, user navigation patterns can be used as a reference for determining a business strategy based on the observed user behavior. In this final project, the web server logs of the Tuneeca online store are processed by implementing clustering, one of the web usage mining methods. Web usage mining is an application of data mining techniques that can be used to discover user navigation patterns. The log data goes through a preprocessing stage, after which the pages are clustered using a graph partitioning algorithm. The results show that the choice of the minimum weight parameter affects the number of clusters produced and the visit coherence value obtained. The graph partitioning algorithm performs quite well in forming clusters of navigation patterns, as indicated by the high modularization quality values obtained. The resulting user navigation patterns can be used as a reference for recommendations on the web development of the Tuneeca online store. Keywords: web usage mining, user navigation patterns, web server log, graph partitioning, visit coherence, modularization quality
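
The effect of the minimum weight parameter on the number of clusters can be illustrated with a small sketch. The edge-weight formula used in the study is not given in the abstract, so this example substitutes a simple stand-in: the weight of an edge is the number of sessions in which two pages co-occur. Edges below a hypothetical `min_weight` are dropped, and each remaining connected component is treated as one cluster. Pages left without any edge are omitted from the result, which is also an assumption of this sketch.

```python
from collections import defaultdict
from itertools import combinations


def cluster_pages(sessions, min_weight):
    """Cluster pages as connected components of a thresholded co-occurrence graph."""
    # Stand-in edge weight: number of sessions containing both pages.
    weight = defaultdict(int)
    for session in sessions:
        for u, v in combinations(sorted(set(session)), 2):
            weight[(u, v)] += 1

    # Keep only the edges whose weight reaches the threshold.
    adj = defaultdict(set)
    for (u, v), w in weight.items():
        if w >= min_weight:
            adj[u].add(v)
            adj[v].add(u)

    # Depth-first search to collect connected components.
    clusters, seen = [], set()
    for page in sorted(adj):
        if page in seen:
            continue
        stack, component = [page], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical sessions (lists of page identifiers).
sessions = [["a", "b"], ["a", "b", "c"], ["c", "d"], ["c", "d"]]
strict = cluster_pages(sessions, min_weight=2)
loose = cluster_pages(sessions, min_weight=1)
```

Raising the threshold removes weak edges and splits the graph into more, smaller clusters, which is consistent with the sensitivity to the minimum weight reported above.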

    Zipf's Law for web surfers

    One of the main activities of Web users, known as 'surfing', is to follow links. Lengthy navigation often leads to disorientation when users lose track of the context in which they are navigating and are unsure how to proceed in terms of the goal of their original query. Studying navigation patterns of Web users is thus important, since it can lead us to a better understanding of the problems users face when they are surfing. We derive Zipf's rank frequency law (i.e., an inverse power law) from an absorbing Markov chain model of surfers' behavior, assuming that less probable navigation trails are, on average, longer than more probable ones. In our model the probability of a trail is interpreted as the relevance (or 'value') of the trail. We apply our model to two scenarios: in the first, the probability of a user terminating the navigation session is independent of the number of links he has followed so far, and in the second, the probability of a user terminating the navigation session increases by a constant each time the user follows a link. We analyze these scenarios using two experimental data sets, showing that, although the first scenario is only a rough approximation of surfers' behavior, the data is consistent with the second scenario, which can thus provide an explanation of surfers' behavior.
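
The difference between the two termination scenarios can be made concrete with a small numerical sketch. It does not reproduce the derivation of Zipf's law; it only computes the distribution of the number of links followed before the session ends, under the assumption that the termination probability after n links is min(1, q0 + n * step). Setting `step` to zero gives the first scenario and a positive `step` gives the second. The parameter names and values are hypothetical.

```python
def length_distribution(q0, step, max_links):
    """P(session ends after exactly n links), for n = 0..max_links."""
    dist, survive = [], 1.0
    for n in range(max_links + 1):
        q = min(1.0, q0 + n * step)   # termination probability after n links
        dist.append(survive * q)      # reached n links, then stopped
        survive *= 1.0 - q            # probability of continuing past n links
    return dist


constant = length_distribution(q0=0.5, step=0.0, max_links=3)     # scenario 1
increasing = length_distribution(q0=0.5, step=0.25, max_links=3)  # scenario 2
```

With a constant termination probability the lengths follow a geometric distribution, whereas an increasing termination probability shifts mass towards shorter sessions and cuts off the long tail.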