
    Web usage mining. Structuring semantically enriched clickstream data

    Web servers worldwide generate a vast amount of information on web users’ browsing activities. Several researchers have studied these so-called clickstream or web access log data to better understand and characterize web users. Clickstream data can be enriched with information about the content of visited pages and the origin (e.g., geographic, organizational) of the requests. The goal of this project is to analyse user behaviour by mining enriched web access log data. We discuss techniques and processes required for preparing, structuring and enriching web access logs. Furthermore, we present several web usage mining methods for extracting useful features. Finally, we employ all these techniques to cluster the users of the domain www.cs.vu.nl and to study their behaviours comprehensively. The contributions of this thesis are a content- and origin-based data enrichment and a tree-like visualization of frequent navigational sequences. This visualization provides an easily interpretable view of patterns in which relevant information is highlighted. The results of this project can be applied to diverse purposes, including marketing, web conten
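    As a rough illustration of the preparation and mining steps named in the abstract above, the sketch below parses access log lines, splits each host's requests into sessions, and counts frequent contiguous page sequences. It is not the thesis's actual pipeline: the Common Log Format input, the 30-minute session timeout, and the function names `parse_line`, `sessionize` and `frequent_sequences` are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Assumed input: Common Log Format, e.g.
# host ident user [time] "METHOD path PROTO" status size
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST|HEAD) (\S+)[^"]*" (\d{3}) \S+'
)
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed cutoff, not from the thesis


def parse_line(line):
    """Return (host, timestamp, path), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    host, raw_time, path, _status = m.groups()
    return host, datetime.strptime(raw_time, "%d/%b/%Y:%H:%M:%S %z"), path


def sessionize(lines):
    """Group requests per host and start a new session after a long gap."""
    by_host = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host, ts, path = parsed
            by_host[host].append((ts, path))
    sessions = []
    for requests in by_host.values():
        requests.sort(key=lambda r: r[0])
        current = [requests[0][1]]
        for (prev_ts, _), (ts, path) in zip(requests, requests[1:]):
            if ts - prev_ts > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(path)
        sessions.append(current)
    return sessions


def frequent_sequences(sessions, length=2, min_support=2):
    """Count contiguous page sequences once per session; keep frequent ones."""
    counts = Counter()
    for pages in sessions:
        seen = {tuple(pages[i:i + length])
                for i in range(len(pages) - length + 1)}
        counts.update(seen)
    return {seq: n for seq, n in counts.items() if n >= min_support}
```

    Counting each sequence at most once per session makes the support value read as "number of sessions containing this path", which is one common convention; counting every occurrence would be an equally valid choice.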

    Early Detection of User Exits from Clickstream Data: A Markov Modulated Marked Point Process Model

    Most users leave e-commerce websites with no purchase. Hence, it is important for website owners to detect users at risk of exiting and intervene early (e.g., adapting website content or offering price promotions). Prior approaches make widespread use of clickstream data; however, state-of-the-art algorithms only model the sequence of web pages visited and not the time spent on them. In this paper, we develop a novel Markov modulated marked point process (M3PP) model for detecting users at risk of exiting with no purchase from clickstream data. It accommodates clickstream data in a holistic manner: our proposed M3PP models both the sequence of pages visited and the temporal dynamics between them, i.e., the time spent on pages. This is achieved by a continuous-time marked point process. Different from previous Markovian clickstream models, our M3PP is the first model in which the continuous nature of time is considered. The marked point process is modulated by a continuous-time Markov process in order to account for different latent shopping phases. As a secondary contribution, we suggest a risk assessment framework. Rather than predicting future page visits, we compute a user’s risk of exiting with no purchase. For this purpose, we build upon sequential hypothesis testing in order to suggest a risk score for user exits. Our computational experiments draw upon real-world clickstream data provided by a large online retailer. Based on this, we find that state-of-the-art algorithms are consistently outperformed by our M3PP model in terms of both AUROC (+6.24 percentage points) and so-called time of early warning (+12.93%). Accordingly, our M3PP model allows for timely detection of user exits and thus provides sufficient time for e-commerce website owners to trigger dynamic online interventions
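    To give a feel for how sequential hypothesis testing can turn clicks and dwell times into a running risk score, here is a minimal sketch in the spirit of Wald's sequential probability ratio test. It is not the M3PP model and has no latent shopping phases: it simply compares two fixed hypotheses ("will exit" vs. "will buy"), each with a page-transition table and an exponential dwell-time rate. All probabilities, rates, page names and the names `EXIT_MODEL`, `BUY_MODEL`, `risk_scores` and `first_alarm` are hypothetical.

```python
import math

# Hypothetical parameters for illustration only; none come from the paper.
# "rate" is the exponential dwell-time rate in events per second.
EXIT_MODEL = {
    "rate": 1 / 10.0,
    "trans": {("home", "product"): 0.5, ("product", "home"): 0.4,
              ("product", "cart"): 0.1},
}
BUY_MODEL = {
    "rate": 1 / 40.0,
    "trans": {("home", "product"): 0.5, ("product", "home"): 0.2,
              ("product", "cart"): 0.3},
}


def log_likelihood(model, prev_page, page, dwell_seconds):
    """Log-likelihood of one click: transition mark times exponential dwell."""
    p = model["trans"].get((prev_page, page), 1e-6)  # floor for unseen moves
    rate = model["rate"]
    return math.log(p) + math.log(rate) - rate * dwell_seconds


def risk_scores(events):
    """Cumulative log-likelihood ratio (exit vs. buy) after each event.

    events: iterable of (prev_page, page, dwell_seconds_on_prev_page).
    """
    total, scores = 0.0, []
    for prev_page, page, dwell in events:
        total += (log_likelihood(EXIT_MODEL, prev_page, page, dwell)
                  - log_likelihood(BUY_MODEL, prev_page, page, dwell))
        scores.append(total)
    return scores


def first_alarm(scores, alpha=0.05, beta=0.05):
    """Index of the first score above Wald's upper threshold, else None."""
    threshold = math.log((1 - beta) / alpha)
    for i, score in enumerate(scores):
        if score > threshold:
            return i
    return None
```

    A session such as `[("home", "product", 5), ("product", "home", 5), ("home", "product", 5)]` accumulates evidence for the exit hypothesis with every short dwell, so the score rises until it crosses the threshold; an intervention could be triggered at that point.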