1,859 research outputs found

    Efficient Web Usage Mining Process for Sequential Patterns

    The tremendous growth in the volume of web usage data has boosted web mining research focused on discovering potentially useful knowledge from that data. This paper presents a new web usage mining process for finding sequential patterns in web usage data, which can be used to predict the likely next move in a browsing session for web personalization. The process consists of three main stages: preprocessing web access sequences from the web server log, mining the preprocessed web log access sequences with a tree-based algorithm, and predicting web access sequences with a dynamic clustering-based model. It integrates the dynamic clustering-based Markov model with the Pre-Order Linked WAP-Tree Mining (PLWAP) algorithm to enhance mining performance. The proposed mining process is verified by experiments with promising results.
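The prediction stage described in this abstract can be illustrated with a minimal sketch. It assumes sessions have already been preprocessed into ordered lists of page names, and it uses a plain first-order Markov model (transition counts), not the PLWAP algorithm or the dynamic clustering-based model from the paper. The function names and sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count page-to-page transitions across all sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each page with the page visited immediately after it.
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, page):
    """Return the most frequent successor of `page`, or None if unseen."""
    followers = counts.get(page)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical preprocessed sessions (illustrative data only).
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "products", "cart", "checkout"],
]
counts = build_transitions(sessions)
print(predict_next(counts, "products"))
```

A first-order model only looks at the current page; longer contexts or per-cluster models, as the abstract suggests, would need more state than this sketch keeps.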

    A Fuzzy Approach for Feature Evaluation and Dimensionality Reduction to Improve the Quality of Web Usage Mining Results

    The explosive growth in the information available on the Web has created the need for Web personalization systems that understand user preferences and dynamically serve customized content to individual users. Web server access logs contain substantial data about users' accesses to a Web site. Hence, if properly exploited, the log data can reveal useful information about the navigational behaviour of users in a site. Web Usage Mining is performed in order to extract this information about user preferences from the logs. Web Usage Mining (WUM) is the application of data mining techniques to web usage log repositories in order to discover usage patterns that can be used to analyze users' navigational behaviour. WUM contains three main steps: preprocessing, knowledge extraction and results analysis. During the preprocessing stage, raw web log data is transformed into a set of user profiles. Each user profile captures a set of URLs representing a user session. Clustering can be applied to this sessionized data in order to capture similar interests and trends among users' navigational patterns. Since the sessionized data may contain thousands of user sessions and each user session may consist of hundreds of URL accesses, dimensionality reduction is achieved by eliminating the low support URLs. Very small sessions are also removed in order to filter out noise from the data. However, direct elimination of low support URLs and small sessions may result in the loss of a significant amount of information, especially when the count of low support URLs and small sessions is large. We propose a fuzzy solution to this problem that assigns weights to URLs and user sessions based on a fuzzy membership function. After assigning the weights, we apply a "Fuzzy c-Means Clustering" algorithm to discover the clusters of user profiles.
In this paper, we describe our fuzzy set theoretic approach to feature selection (or dimensionality reduction) and session weight assignment. Finally, we compare our soft computing based approach to dimensionality reduction with the traditional approach of directly eliminating small sessions and low support count URLs. Our results show that fuzzy feature evaluation and dimensionality reduction result in better performance and validity indices for the discovered clusters.
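The idea of weighting URLs instead of discarding them can be sketched as follows. The abstract does not specify the membership function, so this example assumes a simple linear ramp over URL support (the fraction of sessions containing the URL). The thresholds `low` and `high`, the function names, and the sample sessions are all hypothetical.

```python
from collections import Counter


def ramp_membership(value, low, high):
    """Linear ramp: 0 at or below `low`, 1 at or above `high` (assumed shape)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def url_weights(sessions, low, high):
    """Weight each URL by the fuzzy membership of its support."""
    total = len(sessions)
    # Support counts each URL at most once per session.
    support = Counter(url for session in sessions for url in set(session))
    return {
        url: ramp_membership(count / total, low, high)
        for url, count in support.items()
    }


# Hypothetical sessionized data (illustrative only).
sessions = [["a", "b"], ["a", "c"], ["a", "b"], ["a"]]
weights = url_weights(sessions, low=0.0, high=0.5)
print(weights)
```

With a hard cutoff, a rarely visited URL such as `c` would be dropped outright; here it stays in the feature set with a reduced weight, which a weighted clustering step could then take into account.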

    Mining User Interests from User Search by Using Web Log Data

    Web Usage Mining (WUM) is a data mining method that can be used to discover user access patterns from Web log data. Much work has already been done in this area, and the results are used in applications such as recommending Web usage patterns, personalization, system improvement and business intelligence. WUM includes three phases: preprocessing, pattern discovery and pattern analysis. There are different techniques for WUM, each with its own advantages and disadvantages. We first describe a method for extracting a global semantic representation of a search query log and then show how it can be used to semantically extract user interests. In this paper, user interests are extracted from log data based on visit time and visit density, which can be obtained from an analysis of web users' log data.

    Preprocessing and Content/Navigational Pages Identification as Premises for an Extended Web Usage Mining Model Development

    Since its emergence, the internet has grown spectacularly, not only in the number of websites and the volume of information but also in the number of visitors. This created the need for an overall analysis of both web sites and the content they provide. Thus a new branch of research developed, namely web mining, which aims to discover useful information and knowledge based not only on the analysis of websites and content but also on the way users interact with them. The aim of the present paper is to design a database that captures only the relevant data from logs in a way that allows large sets of temporal data to be stored and managed with common tools in real time. In our work, we rely on different web sites or website sections with known architecture, and we test several hypotheses from the literature in order to extend the framework to sites with unknown or chaotic structure, where the type of visited pages is not transparent. To do this, we start from non-proprietary, preexisting raw server logs. Keywords: Knowledge Management, Web Mining, Data Preprocessing, Decision Trees, Databases

    WEB MINING IN E-COMMERCE

    The web has recently become an important part of people's lives. It is a very good place to run successful businesses. Selling products or services online plays an important role in the success of businesses that have a physical presence, like a re[…] Keywords: E-Commerce, Data mining, Web mining