    WEB PAGE ACCESS PREDICTION USING FUZZY CLUSTERING BY LOCAL APPROXIMATION MEMBERSHIPS (FLAME) ALGORITHM

    Web page prediction is a web usage mining technique that predicts the next set of web pages a user may visit from knowledge of previously visited pages. The World Wide Web (WWW) is a popular and interactive medium for publishing information. While browsing, users often visit many unwanted pages before reaching the targeted page. Web usage mining techniques address this problem by analyzing the usage patterns of a web site. Clustering is a data mining technique used to identify similar access patterns. Mining similar access patterns yields better recommendation accuracy than mining dissimilar ones, and the discovered patterns can be used for better web page access prediction. Here, two clustering techniques, Fuzzy C-Means (FCM) and FLAME, are investigated for predicting the web page that will be accessed next based on users' previous browsing behavior. The performance of the FLAME clustering algorithm was found to be better than that of the fuzzy C-means, fuzzy K-means, and fuzzy self-organizing map (SOM) algorithms. It also improves user browsing time without compromising prediction accuracy.
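
As a rough illustration of the fuzzy C-means baseline named in this abstract, the sketch below computes the standard FCM membership degrees of one point with respect to fixed cluster centers. The function name `fcm_memberships`, the two-dimensional toy vectors, and the fuzzifier default `m=2.0` are assumptions made for this example; the abstract does not describe how sessions were encoded, and FLAME itself is not shown.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Return the fuzzy C-means membership of `point` in each cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i.
    `m` (> 1) is the fuzzifier; 2.0 is only a common default, assumed here.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical session vector and cluster centers (toy values).
memberships = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

Unlike hard clustering, every session receives a degree of membership in every cluster, and the degrees sum to one.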

    A Clustering and Associativity Analysis Based Probabilistic Method for Web Page Prediction

    Today, information and resources are available online through websites and web pages. To get instant information about any product, institution, or organization, users can access the web pages available online. In this work, a three-stage model is proposed for more intelligent web page prediction. The method uses clustering and associativity analysis with rule formulation to improve the prediction results. In the first stage, C-Means clustering is applied to identify sessions with high and low usage of web pages. Once clustering is done, a rule is defined to identify the sessions whose page occurrence is above average. In the final stage, a neuro-fuzzy approach is applied to perform the web page prediction. The results show that the model provides an effective derivation of web page visits.

    Rough Sets Clustering and Markov model for Web Access Prediction

    Discovering user access patterns from web access logs is increasingly important for building adaptive web servers that respond to individual users' behavior. The variety of user behaviors in accessing information also grows, which has a great impact on network utilization. In this paper, we present a rough set clustering method to cluster web transactions from web access logs, and we use a Markov model for next-access prediction. Using this approach, users can effectively mine web log records to discover and predict access patterns. We perform experiments using real web trace logs collected from www.dusit.ac.th servers. To improve the prediction ratio, the model includes a rough sets scheme in which a similarity measure computes the similarity between two sequences using upper approximation.
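
The Markov step of next-access prediction mentioned in this abstract can be sketched as a first-order model that counts page-to-page transitions and predicts the most frequent successor. This is a minimal sketch, not the paper's method: the rough set clustering stage is omitted, and the function names and toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count first-order transitions (current page -> next page)."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, page):
    """Return (most likely next page, estimated probability), or None
    when the page was never seen with a successor."""
    followers = counts.get(page)
    if not followers:
        return None
    best, hits = followers.most_common(1)[0]
    return best, hits / sum(followers.values())


# Hypothetical sessions; real input would be parsed from web access logs.
sessions = [
    ["home", "courses", "apply"],
    ["home", "courses", "fees"],
    ["home", "news"],
    ["courses", "apply"],
]
model = fit_transitions(sessions)
prediction = predict_next(model, "home")
```

In a clustering-based pipeline, one such transition table could be fitted per cluster of similar sessions rather than on the whole log.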

    Exploration of Wikipedia traffic data to analyze the relationship between multiple pages

    Time series analysis and forecasting are an essential part of any holistic data analysis. Many prediction challenges depend on correctly assessing how the data changes over time. Traditional examples of time series analysis are mostly univariate, but that is rarely the case in a real-world setting. Any time series is influenced by multiple components, including its own past and other constant or variable factors. This project aims to understand multivariate time series models using non-traditional time series analysis techniques such as clustering and a sequence-to-sequence model with a long short-term memory (LSTM) architecture. The dataset used in this experiment is Wikipedia web page traffic. It contains around 145,000 web pages and their corresponding traffic from July 2015 to December 2016. Findings based on the hierarchical clustering model are presented in this study. Master of Science in Information Scienc

    A COLLABORATIVE FILTERING APPROACH TO PREDICT WEB PAGES OF INTEREST FROM NAVIGATION PATTERNS OF PAST USERS WITHIN AN ACADEMIC WEBSITE

    This dissertation is a simulation study of factors and techniques involved in designing hyperlink recommender systems that recommend to users web pages that past users with similar navigation behaviors found interesting. The methodology involves identifying pertinent factors or techniques and, for each one, addressing the following questions: (a) room for improvement; (b) better approach, if any; and (c) performance characteristics of the technique in the environments that hyperlink recommender systems operate in. The following four problems are addressed. Web Page Classification: A new metric (PageRank × Inverse Links-to-Word count ratio) is proposed for classifying web pages as content or navigation, to help in the discovery of user navigation behaviors from web user access logs. Results of a small user study suggest that this metric leads to desirable results. Data Mining: A new apriori algorithm for mining association rules from large databases is proposed. The new algorithm addresses the scaling problem of the classical apriori algorithm by eliminating an expensive join step and applying the apriori property to every row of the database. In this study, association rules show the correlation relationships between user navigation behaviors and web pages they find interesting. The new algorithm has better space complexity than the classical one, with better time efficiency under some conditions and comparable time efficiency under others. Prediction Models for User Interests: We demonstrate that association rules that show the correlation relationships between user navigation patterns and web pages they find interesting can be transformed into collaborative filtering data. We investigate collaborative filtering prediction models based on two approaches for computing prediction scores: simple averages and weighted averages. Our findings suggest that the weighted averages scheme computes predictions of user interests more accurately than the simple averages scheme does. Clustering: Clustering techniques are frequently applied in the design of personalization systems. We studied the performance of the CLARANS clustering algorithm in high-dimensional space in relation to the PAM and CLARA clustering algorithms. While CLARA had the best time performance, CLARANS resulted in clusters with the lowest intra-cluster dissimilarities, and so was most effective in this regard.
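
The difference between the two scoring schemes compared in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each past user contributes an interest score for a page and a similarity weight relative to the current user; the dissertation's actual weighting is not stated in the abstract, and the function names and numbers below are hypothetical.

```python
def simple_average(scores):
    """Prediction score as the unweighted mean of past users' scores."""
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    """Prediction score as a mean weighted by each past user's weight
    (assumed here to be a similarity to the current user)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical data: three past users, 1.0 = found the page interesting.
scores = [1.0, 0.0, 1.0]
weights = [0.5, 0.25, 0.25]

simple = simple_average(scores)                # 2/3
weighted = weighted_average(scores, weights)   # 0.75
```

The weighted scheme lets users whose navigation resembles the current user's count for more, which is one plausible reason it could predict interests more accurately.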

    Generating dynamic higher-order Markov models in web usage mining

    Markov models have been widely used for modelling users’ web navigation behaviour. In previous work we have presented a dynamic clustering-based Markov model that accurately represents second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes into account higher-order conditional probabilities. The method makes use of the state cloning concept together with a clustering technique to separate the navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real-world data sets. The results show that some pages require a long history to understand the users’ choice of link, while others require only a short history. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter.
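
To see why some pages need a longer history, the sketch below estimates conditional probabilities from an order-k context and compares order 1 with order 2 on toy sessions. It only counts contexts directly; it is not the state cloning and clustering method described in the abstract, and the function names and sessions are assumptions for this example.

```python
from collections import Counter, defaultdict


def fit_order_k(sessions, k):
    """Count how often each page follows each length-k context."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(k, len(session)):
            context = tuple(session[i - k:i])
            counts[context][session[i]] += 1
    return counts


def probability(counts, context, following):
    """Estimate P(following | context); 0.0 for an unseen context."""
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[following] / sum(followers.values())


# Hypothetical sessions in which the choice after page C depends on
# which page preceded C.
sessions = [
    ["A", "C", "X"],
    ["A", "C", "X"],
    ["B", "C", "Y"],
    ["B", "C", "Y"],
]
first = fit_order_k(sessions, 1)
second = fit_order_k(sessions, 2)

p_first = probability(first, ["C"], "X")         # 0.5
p_second = probability(second, ["A", "C"], "X")  # 1.0
```

Here the first-order estimate for page C is uninformative, while conditioning on the preceding page separates the two navigation paths; this is the kind of difference that would justify cloning state C.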