3 research outputs found
Data pre-processing on web server logs for generalized association rules mining algorithm
Web log file analysis began as a way for IT administrators to ensure adequate bandwidth and server capacity on their organization's website. Log file data can offer valuable insight into website usage. It reflects actual usage in natural working conditions, compared to the artificial setting of a usability lab. It represents the activity of many users over a potentially long period of time, compared to a limited number of users for an hour or two each. This paper describes the pre-processing techniques applied to IIS Web Server logs, from the raw log file up to the point at which mining can be performed. Since pre-processing is a tedious process, the steps involved depend on the algorithm and the purpose of the application.
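As a rough illustration of the kind of pre-processing the abstract above refers to, the following minimal sketch parses IIS logs in the W3C extended format and filters out entries that are usually not useful for mining. The function names, the list of ignored extensions, the success-only filter, and the sample lines are assumptions made for this example; they are not taken from the paper.

```python
from typing import Dict, Iterable, List

# Assumption: requests for these resource types are not page views.
IGNORED_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_w3c_log(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn W3C extended log lines into dicts keyed by the #Fields names."""
    fields: List[str] = []
    records: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines; only #Fields defines the column layout.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed entries
        records.append(dict(zip(fields, values)))
    return records


def clean_records(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep successful requests for pages; drop images, styles, and errors."""
    kept = []
    for rec in records:
        uri = rec.get("cs-uri-stem", "").lower()
        status = rec.get("sc-status", "")
        if uri.endswith(IGNORED_EXTENSIONS):
            continue
        if not status.startswith("2"):
            continue
        kept.append(rec)
    return kept


# Hypothetical sample data for illustration only.
SAMPLE = """#Software: Microsoft Internet Information Services 6.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status
2004-01-01 10:00:00 10.0.0.1 GET /index.asp 200
2004-01-01 10:00:01 10.0.0.1 GET /images/logo.gif 200
2004-01-01 10:00:05 10.0.0.2 GET /missing.asp 404
""".splitlines()

cleaned = clean_records(parse_w3c_log(SAMPLE))
```

Which entries to discard depends on the mining algorithm and the application, so the two filters here should be read as placeholders rather than a fixed recipe.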
Log-Based Session Profiling and Online Behavioral Prediction in E-Commerce Websites
Improvements to customer experience give companies a competitive advantage, as understanding customers' behaviors allows e-commerce companies to enhance their marketing strategies by means of recommendation techniques and the customization of products and services. This is not a simple task, and it becomes more difficult when working with anonymous sessions, since no historical information about the user can be applied. In this article, analysis and clustering of the clickstreams of past anonymous sessions are used to synthesize a prediction model based on a neural network. The model allows for prediction of a user's profile after a few clicks of an online anonymous session. This information can be used by the e-commerce site's decision system to generate online recommendations and better adapt the offered services to the customer's profile.
Integrating Network Analysis and Data Mining Techniques into Effective Framework for Web Mining and Recommendation. A Framework for Web Mining and Recommendation
The main motivation for the study described in this dissertation is to benefit from developments in technology and from the huge amount of available data that can be easily captured, stored and maintained electronically. We concentrate on Web usage (i.e., log) mining and Web structure mining. Analysing Web log data reveals valuable feedback on how effective the current structure of a web site is, and helps the owner of the site understand the behaviour of its visitors. We developed a framework that integrates statistical analysis, frequent pattern mining, clustering, classification, and network construction and analysis. We concentrated on the statistical data related to the visitors and on how they surf and pass through the various pages of a given web site to land at some target pages. Further, the frequent pattern mining technique was used to study the relationships between the various pages constituting a given web site. Clustering is used to study the similarity of users and pages. Classification suggests a target class for a given new entity by comparing the characteristics of the new entity to those of the known classes. Network construction and analysis is also employed to identify and investigate the links between the various pages constituting a web site: a network is constructed based on the frequency of access to the pages, such that pages are linked in the network if the frequent pattern mining process identifies them as frequently accessed together. The knowledge discovered by analysing a web site and its related data should be considered valuable for online shoppers and commercial web site owners. Benefitting from the outcome of the study, a recommendation system was developed to suggest pages to visitors based on their profiles as compared to similar profiles of other visitors.
The experiments conducted using popular datasets demonstrate the applicability and effectiveness of the proposed framework for Web mining and recommendation. As a by-product of the proposed method, we demonstrate its effectiveness for feature reduction in another domain, gene expression data analysis, with some interesting results reported in Chapter 5.
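The network construction step described in this abstract, in which pages are linked when they are frequently accessed together, can be sketched as follows. This is a simplified illustration rather than the dissertation's method: it counts only page pairs (frequent itemsets of size two), and the function names, the `min_support` threshold, and the sample sessions are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def frequent_page_pairs(sessions: Iterable[List[str]],
                        min_support: int) -> Dict[Pair, int]:
    """Count in how many sessions each pair of pages occurs together."""
    counts: Counter = Counter()
    for pages in sessions:
        # Each page counts once per session; sorting gives a canonical pair.
        for pair in combinations(sorted(set(pages)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def build_network(pairs: Iterable[Pair]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map linking co-accessed pages."""
    graph: Dict[str, Set[str]] = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


# Hypothetical sessions for illustration only.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]

pairs = frequent_page_pairs(sessions, min_support=2)
network = build_network(pairs)
```

With these sample sessions and a support threshold of two, `/products` ends up linked to both `/home` and `/cart`, while `/about` is left out of the network because it never co-occurs often enough with another page.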