
    Fuzzy Clustering in Web Mining

    Web mining is the use of data mining techniques to automatically discover and extract information from the web. Clustering is one technique that can make finding information more efficient. Conventional clustering assigns each data object to exactly one cluster, but such a hard partition is insufficient to represent many real situations. Fuzzy clustering instead constructs clusters with uncertain boundaries and allows an object to belong to multiple clusters, each with a degree of membership. Because web data has fuzzy characteristics, fuzzy clustering is better suited to web mining than conventional clustering. In this paper, we propose two algorithms, Fuzzy c-Means (FCM) and clustering based on fuzzy equivalence relations, which can be used for web page mining and web usage mining. Experiments are carried out on real data with different algorithmic parameters, and the proposed algorithms are compared with conventional clustering algorithms. The results obtained from the proposed algorithms are more convincing.
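
The soft-membership idea behind Fuzzy c-Means can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the random initialization, and the stopping tolerance are all assumptions chosen for illustration, and the input is assumed to be plain numeric feature vectors.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for points given as lists of floats.

    Hypothetical helper: the name, defaults, and initialization are assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    power = 2.0 / (m - 1.0)

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Each center is the mean of all points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum for d in range(dim)]
            )

        # Memberships follow from the relative distances to every center.
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** power for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changes by more than the tolerance.
        if shift < tol:
            break
    return centers, u
```

Each row of the returned membership matrix sums to 1, so a web page or session that sits between two groups receives a split membership instead of a forced hard label.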

    WEB PAGE ACCESS PREDICTION USING FUZZY CLUSTERING BY LOCAL APPROXIMATION MEMBERSHIPS (FLAME) ALGORITHM

    Web page prediction is a web usage mining technique that predicts the next set of web pages a user may visit based on knowledge of previously visited pages. The World Wide Web (WWW) is a popular and interactive medium for publishing information. While browsing the web, users visit many unwanted pages instead of the targeted page. Web usage mining techniques address this problem by analyzing the usage patterns of a web site. Clustering is a data mining technique used to identify similar access patterns. Mining these similar patterns, rather than dissimilar ones, improves recommendation accuracy, and the discovered patterns can be used for better web page access prediction. Here, two clustering techniques, Fuzzy C-Means (FCM) and FLAME, are investigated for predicting the web page that will be accessed next based on users' previous browsing behavior. The performance of the FLAME clustering algorithm was found to be better than that of the fuzzy C-means, fuzzy K-means, and fuzzy self-organizing map (SOM) algorithms. It also improves user browsing time without compromising prediction accuracy.

    Interval set clustering of web users using modified Kohonen self-organizing maps based on the properties of rough sets

    Web usage mining involves the application of data mining techniques to discover usage patterns from web data. Clustering is one of the important functions in web usage mining. The likelihood of bad or incomplete data is higher in web usage mining than in conventional applications, and the clusters and associations it produces do not necessarily have crisp boundaries. Researchers have studied the possibility of using fuzzy sets in web mining clustering applications. Recent attempts have adapted the K-means clustering algorithm, as well as genetic algorithms based on rough sets, to find interval sets of clusters. Clustering based on genetic algorithms may not be able to handle large amounts of data, and the K-means algorithm does not lend itself well to adaptive clustering. This paper proposes an adaptation of Kohonen self-organizing maps, based on the properties of rough sets, to find the interval sets of clusters. Experiments are used to create interval set representations of clusters of web visitors on three educational web sites. The proposed approach has wider applications in other areas of web mining as well as data mining.
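
An interval set describes each cluster by a lower bound (objects that certainly belong) and an upper bound (objects that possibly belong). The sketch below shows only a distance-ratio assignment rule of the kind used in rough-set adaptations of K-means. It is not the paper's Kohonen-based procedure; the function name `interval_assign` and the `threshold` value are assumptions for illustration.

```python
import math


def interval_assign(point, centers, threshold=1.25):
    """Return (lower, upper): cluster indices whose lower/upper bound gets the point.

    Hypothetical helper: the ratio test and its threshold are assumptions.
    """
    dists = [math.dist(point, center) for center in centers]
    nearest = min(range(len(centers)), key=dists.__getitem__)
    d_min = max(dists[nearest], 1e-12)  # guard against division by zero

    # Clusters almost as close as the nearest one make the point ambiguous.
    close = [
        j for j, d in enumerate(dists)
        if j != nearest and d / d_min <= threshold
    ]
    if close:
        # Ambiguous point: upper bounds of all competing clusters, no lower bound.
        return [], sorted([nearest] + close)
    # Unambiguous point: lower bound of its cluster, and therefore its upper bound too.
    return [nearest], [nearest]
```

A visitor whose behaviour lies between two groups therefore appears in the upper bounds of both clusters and in the lower bound of neither, which is how the boundary between clusters stays non-crisp.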

    Cluster Optimization for Improved Web Usage Mining

    Nowadays, the World Wide Web (WWW) has become a rich and powerful source of information. However, its continuous growth has made retrieving the information actually needed a difficult and critical task. Web usage mining is a step-wise technique for extracting useful user access patterns from the web. Web personalization uses web usage mining techniques to acquire knowledge by analyzing user navigational patterns. Web page personalization involves clustering web pages that have similar navigation patterns for an individual. Since cluster size grows with frequent access, optimizing or shrinking the clusters becomes a chief consideration. This paper proposes a cluster optimization approach based on swarm intelligence techniques. After user access patterns are recognized, clustering is implemented using a neural fuzzy approach, the NEF Class algorithm, and cluster optimization is implemented using the Ant Nest Mate approach.

    Temporal mining of the web and supermarket data using fuzzy and rough set clustering

    xviii, 117 leaves : ill. (some col.) ; 28 cm. Includes abstract. Includes bibliographical references (leaves 114-117). Clustering is an important aspect of data mining. Many data mining applications tend to be more amenable to non-conventional clustering techniques. In this research, three clustering methods (conventional, rough set, and fuzzy) are employed to analyze web usage and supermarket data sets. Interval clusters based on fuzzy memberships are also created. The web usage data were collected from three educational web sites. The supermarket data covered twenty-six weeks of transactions from twelve stores in three regions. Cluster sizes obtained using the three methods are compared, and cluster characteristics are analyzed. Web users and supermarket customers tend to change their characteristics over time, and these changes may be temporary or permanent. This thesis also studies the changes in cluster characteristics over time. Both experiments demonstrate that the rough and fuzzy methods are more subtle and accurate in capturing the slight differences among clusters.

    A Fuzzy Approach for Feature Evaluation and Dimensionality Reduction to Improve the Quality of Web Usage Mining Results

    The explosive growth in the information available on the Web has necessitated the development of Web personalization systems that understand user preferences and dynamically serve customized content to individual users. Web server access logs contain substantial data about users' accesses to a Web site. If properly exploited, the log data can reveal useful information about the navigational behaviour of users on a site. Web Usage Mining (WUM) is performed to reveal this information about user preferences. It is the application of data mining techniques to web usage log repositories in order to discover usage patterns that can be used to analyze users' navigational behaviour. WUM consists of three main steps: preprocessing, knowledge extraction, and results analysis. During the preprocessing stage, raw web log data is transformed into a set of user profiles. Each user profile captures a set of URLs representing a user session. Clustering can be applied to this sessionized data in order to capture similar interests and trends among users' navigational patterns. Since the sessionized data may contain thousands of user sessions, and each user session may consist of hundreds of URL accesses, dimensionality reduction is achieved by eliminating the low-support URLs. Very small sessions are also removed in order to filter out noise from the data. However, direct elimination of low-support URLs and small sessions may result in the loss of a significant amount of information, especially when the count of low-support URLs and small sessions is large. We propose a fuzzy solution to this problem that assigns weights to URLs and user sessions based on a fuzzy membership function. After assigning the weights, we apply a Fuzzy c-Means clustering algorithm to discover the clusters of user profiles.
    In this paper, we describe our fuzzy set theoretic approach to feature selection (or dimensionality reduction) and session weight assignment. Finally, we compare our soft computing based approach to dimensionality reduction with the traditional approach of directly eliminating small sessions and low-support URLs. Our results show that fuzzy feature evaluation and dimensionality reduction yield better performance and validity indices for the discovered clusters.
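
The contrast between hard elimination and fuzzy weighting of low-support URLs can be sketched as follows. The abstract does not give the membership function, so the linear ramp, the names `ramp_membership` and `weight_sessions`, and the `low` and `high` parameters are all assumptions for illustration.

```python
def ramp_membership(support, low, high):
    """Linear ramp: 0 at or below `low`, 1 at or above `high` (assumed shape)."""
    if support <= low:
        return 0.0
    if support >= high:
        return 1.0
    return (support - low) / (high - low)


def weight_sessions(sessions, low=0, high=5):
    """Turn sessions (lists of URLs) into dicts mapping URL -> fuzzy weight.

    Hypothetical helper: a URL's support is the number of sessions containing it.
    """
    support = {}
    for session in sessions:
        for url in set(session):  # count each URL once per session
            support[url] = support.get(url, 0) + 1

    # Low-support URLs keep a small weight instead of being dropped outright.
    weights = {url: ramp_membership(s, low, high) for url, s in support.items()}
    return [{url: weights[url] for url in set(session)} for session in sessions]
```

Session weights could be assigned in the same way, for example by passing the session length to `ramp_membership`, before the weighted profiles are handed to a fuzzy clustering step.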