
1,699 research outputs found

Integrating E-Commerce and Data Mining: Architecture and Challenges

Author: Ansari Suhail
Kohavi Ron
Mason Llew
Zheng Zijian
Publication date: 01/01/2000

We show that the e-commerce domain can provide all the right ingredients for successful data mining and claim that it is a killer domain for data mining. We describe an integrated architecture, based on our experience at Blue Martini Software, for supporting this integration. The architecture can dramatically reduce the pre-processing, cleaning, and data understanding effort often documented to take 80% of the time in knowledge discovery projects. We emphasize the need for data collection at the application server layer (not the web server) in order to support logging of data and metadata that is essential to the discovery process. We describe the data transformation bridges required from the transaction processing systems and customer event streams (e.g., clickstreams) to the data warehouse. We detail the mining workbench, which needs to provide multiple views of the data through reporting, data mining algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200

arXiv.org e-Print Archive

CiteSeerX

Analysis & Visualization of EHR Patient Portal Clickstream Data

Author: Garber Lawrence
Johnson Sharon
Mushtaq Farhan
Strong Diane
Trudel John
Tulu Bengisu
Publication venue: AIS Electronic Library (AISeL)
Publication date: 26/06/2015

The purpose of this paper is to analyze EHR clickstream data from a patient portal to determine patient usage behavior. We present our analysis of patterns found in patient clickstream data. Using directed and undirected data mining approaches, the data can be explored to examine whether different patient groups appear to use the portal differently. We examine changes in usage over time, and also explore differences in usage, average number of clicks per session, and time spent per page based on age and gender. We then use clustering to create groups that discriminate patients by their portal usage behavior. Knowledge of these usage patterns can help service providers understand the demographics and behavioral aspects of their patients, which in turn can help them develop, enhance, and improve their systems to make the best use of these portals.
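The two per-session metrics named in this abstract (average clicks per session and time spent per page) can be sketched as follows. This is a minimal illustration, not the authors' method: the `(user_id, timestamp)` event format, the function names, and the 30-minute inactivity timeout used to split sessions are all assumptions, since the abstract does not say how sessions were defined.

```python
from collections import defaultdict

# Assumed inactivity timeout for splitting sessions; not taken from the paper.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp_seconds) clicks into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = []
    for user_id, stamps in by_user.items():
        stamps.sort()
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > gap:
                # Long pause: close the current session and start a new one.
                sessions.append((user_id, current))
                current = [ts]
            else:
                current.append(ts)
        sessions.append((user_id, current))
    return sessions


def average_clicks_per_session(sessions):
    """Mean number of clicks across all sessions."""
    return sum(len(stamps) for _, stamps in sessions) / len(sessions)


def average_seconds_per_page(sessions):
    """Mean gap between consecutive clicks within a session.

    The last click of a session has no following click, so its dwell
    time is unknown and is left out.
    """
    gaps = [b - a for _, stamps in sessions for a, b in zip(stamps, stamps[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Hypothetical toy clickstream: user "a" has two sessions, user "b" has one.
events = [("a", 0), ("a", 60), ("a", 120), ("a", 10000), ("b", 5), ("b", 65)]
sessions = sessionize(events)
```

Comparing these averages across age or gender groups, as the abstract describes, would only require attaching a demographic label to each `user_id` before aggregating.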

AIS Electronic Library (AISeL)

The Metabolism and Growth of Web Forums

Author: Wu Lingfei
Zhang Jiang
Zhao Min
Publication venue: Public Library of Science (PLoS)
Publication date: 28/08/2013

We view web forums as virtual living organisms feeding on users' attention and investigate how these organisms grow at the expense of collective attention. We find that the "body mass" ($PV$) and "energy consumption" ($UV$) of the studied forums exhibit the allometric growth property, i.e., $PV_t \sim UV_t^{\theta}$. This implies that within a forum, the network transporting attention flow between threads has a time-invariant structure, despite the continuous change of its nodes (threads) and edges (clickstreams). The observed time-invariant topology allows us to explain the dynamics of networks by the behavior of threads. In particular, we describe the clickstream dissipation on threads using the function $D_i \sim T_i^{\gamma}$, in which $T_i$ is the clickstream to node $i$ and $D_i$ is the clickstream dissipated from $i$. It turns out that $\gamma$, an indicator of dissipation efficiency, is negatively correlated with $\theta$, and $1/\gamma$ sets the lower bound for $\theta$. Our findings have practical consequences. For example, $\theta$ can be used as a measure of the "stickiness" of forums, because it quantifies the stable ability of forums to convert $UV$ into $PV$, i.e., to keep users "locked in" to the forum. Meanwhile, the correlation between $\gamma$ and $\theta$ provides a convenient method to evaluate the "stickiness" of forums. Finally, we discuss an optimized "body mass" of forums at around $10^5$ that minimizes $\gamma$ and maximizes $\theta$. Comment: 6 figure
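A scaling relation of the form $PV_t \sim UV_t^{\theta}$ is commonly estimated by a least-squares fit on log-transformed values, because $\log PV = \log c + \theta \log UV$ is a straight line with slope $\theta$. The sketch below illustrates that generic technique. The abstract does not say how the authors estimated $\theta$, so the fitting method, the function name, and the synthetic data are all assumptions.

```python
import math


def fit_power_law_exponent(uv, pv):
    """Estimate theta and c in pv ~ c * uv**theta by least squares on logs."""
    xs = [math.log(u) for u in uv]
    ys = [math.log(p) for p in pv]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta = sxy / sxx                          # slope of the log-log line
    c = math.exp(mean_y - theta * mean_x)      # intercept mapped back from log space
    return theta, c


# Synthetic, noise-free series generated with theta = 1.2 and c = 2 (not real forum data).
uv = [10, 100, 1000, 10000]
pv = [2 * u ** 1.2 for u in uv]
theta, c = fit_power_law_exponent(uv, pv)
```

The same function could be applied to per-thread $(T_i, D_i)$ pairs to estimate $\gamma$, since $D_i \sim T_i^{\gamma}$ has the same form.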

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

PubMed Central

FigShare

Customer purchase behavior prediction in E-commerce: a conceptual framework and research agenda

Author: Bezbradica Marija
Cirqueira Douglas
Helfert Markus
Hofer Markus
Nedbal Dietmar
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 14/05/2020

Digital retailers are experiencing an increasing number of online transactions from their consumers, a consequence of the convenience of buying goods via E-commerce platforms. Such interactions compose complex behavioral patterns which can be analyzed through predictive analytics to enable businesses to understand consumer needs. In this abundance of big data and possible tools to analyze them, a systematic review of the literature is missing. Therefore, this paper presents a systematic literature review of recent research dealing with customer purchase prediction in the E-commerce context. The main contributions are a novel analytical framework and a research agenda in the field. The framework reveals three main tasks in this review, namely, the prediction of customer intents, buying sessions, and purchase decisions. Those are followed by their employed predictive methodologies and are analyzed from three perspectives. Finally, the research agenda provides major existing issues for further research in the field of online purchase behavior prediction.

DCU Online Research Access Service

Binary Particle Swarm Optimization based Biclustering of Web usage Data

Author: Bagyamani J.
Rathipriya R.
Thangavel K.
Publication venue: Foundation of Computer Science
Publication date: 30/09/2011

Web mining is the nontrivial process of discovering valid, novel, potentially useful knowledge from web data using data mining techniques. It may give information that is useful for improving the services offered by web portals and information access and retrieval tools. With the rapid development of biclustering, more researchers have applied the technique to different fields in recent years. When the biclustering approach is applied to web usage data, it automatically captures the hidden browsing patterns in the form of biclusters. In this work, a swarm intelligence technique is combined with the biclustering approach to propose an algorithm called Binary Particle Swarm Optimization (BPSO) based Biclustering for Web Usage Data. The main objective of this algorithm is to retrieve the globally optimal bicluster from the web usage data. These biclusters contain relationships between web users and web pages which are useful for E-Commerce applications like web advertising and marketing. Experiments are conducted on a real dataset to demonstrate the efficiency of the proposed algorithm.
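For orientation, the core update of a generic binary particle swarm optimizer can be sketched as below. This is the textbook sigmoid-based binary PSO step, not the paper's algorithm: the abstract gives neither the encoding, the fitness function, nor the parameters. The assumption here is that each bit marks whether a user (row) or page (column) belongs to the candidate bicluster, and all parameter values and names are illustrative.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(position, velocity, personal_best, global_best, rng,
              inertia=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary PSO update for a single particle.

    position, personal_best, global_best: lists of 0/1 bits (assumed to be
    membership flags for users and pages in a candidate bicluster).
    velocity: list of floats, one per bit.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Pull the velocity toward the particle's own best and the swarm's best.
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp so the sigmoid never saturates completely.
        v = max(-v_max, min(v_max, v))
        new_velocity.append(v)
        # The velocity sets the probability that the bit is 1.
        new_position.append(1 if rng.random() < sigmoid(v) else 0)
    return new_position, new_velocity


rng = random.Random(0)  # seeded only to make the sketch repeatable
position, velocity = bpso_step(
    position=[0, 1, 0, 1],
    velocity=[0.0, 0.0, 0.0, 0.0],
    personal_best=[1, 1, 0, 0],
    global_best=[1, 0, 0, 1],
    rng=rng,
)
```

A full optimizer would repeat this step for every particle, score each new position with a bicluster quality measure, and update the personal and global bests accordingly.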

arXiv.org e-Print Archive

Crossref

Dropout Model Evaluation in MOOCs

Author: Brooks Christopher
Gardner Josh
Publication date: 16/02/2018

The field of learning analytics needs to adopt a more rigorous approach for predictive model evaluation that matches the complex practice of model-building. In this work, we present a procedure to statistically test hypotheses about model performance which goes beyond the state-of-the-practice in the community to analyze both algorithms and feature extraction methods from raw data. We apply this method to a series of algorithms and feature sets derived from a large sample of Massive Open Online Courses (MOOCs). While a complete comparison of all potential modeling approaches is beyond the scope of this paper, we show that this approach reveals a large gap in dropout prediction performance between forum-, assignment-, and clickstream-based feature extraction methods, where the last is significantly better than the first two, which are in turn indistinguishable from one another. This work has methodological implications for evaluating predictive or AI-based models of student success, and practical implications for the design and targeting of at-risk student models and interventions.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications