449 research outputs found
A Data-driven Statistical Approach to Customer Behaviour Analysis and Modelling in Online Freemium Games
The video games industry is one of the most attractive and lucrative segments of entertainment and digital media, worth more than $150 billion worldwide. A popular approach in this industry is the online freemium model, wherein the game can be downloaded free of charge, while advanced and bonus content carries optional charges. Monetisation is through micro payments by customers, and the focus is on maintaining average revenue per user and the lifetime value of players. The overall aim of this research is to develop suitable data-driven methods to gain insight into customer behaviour in online freemium games, with a view to providing recommendations for successful business in this industry.
Three important aspects of user behaviour are modelled in this research: engagement, time until defection, and number of micro transactions made. A multiple logistic regression using a penalised likelihood approach is found to be most suitable for modelling engagement and demonstrates good fit and accuracy for assigning observations to engaged and non-engaged categories. Cox’s proportional hazards model is adopted to analyse time to defection, and a zero-inflated negative binomial model results in the best fit to the data on micro payments. Cluster analysis techniques are used to classify the wide variety of customers based on their gameplay styles, and social network models are developed to identify prominent ‘actors’ based on social interactions. Some of the significant predictors of engagement and monetisation are the amount of premium in-game currency, success in missions and competency in virtual fights, and the quantity of virtual resources used in the game.
This research offers extensive insight into what drives the reputation, virality and commercial viability of freemium games. In particular, it helps to fill a gap in understanding the behaviour of online game players by demonstrating the effectiveness of applying a data analytic approach. It gives more insight into the determinants of player behaviour than relying on observational studies or those based on survey research. Additionally, it refines statistical models and demonstrates their implementation in R on new and complex data types representing online customer behaviours.
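As a rough illustration of the zero-inflated negative binomial model named in the abstract above, the sketch below evaluates its probability mass function. It is written in Python rather than the R used in the thesis, and the parameter names (`mu` for the mean, `r` for the dispersion, `pi` for the extra-zero probability) and the example values are assumptions made here, not values from the research.

```python
import math

def nb_log_pmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and dispersion r."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def zinb_pmf(y, mu, r, pi):
    """Zero-inflated negative binomial pmf.

    With probability pi the count is a structural zero (for example, a
    player who never pays); otherwise it follows the negative binomial.
    """
    nb = math.exp(nb_log_pmf(y, mu, r))
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb

# Hypothetical parameter values, chosen only for illustration.
probabilities = [zinb_pmf(y, mu=2.0, r=1.5, pi=0.3) for y in range(5)]
```

The zero-inflation term only raises the probability of a zero count, which is why such a model can suit payment data in which most players make no purchases at all.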
Digital user's decision journey
The landscape of the Internet is continually evolving. This creates huge opportunities for different industries to optimize vital online channels, resulting in various forms of new Internet services. As a result, digital users interact with many digital systems and exhibit dynamic behaviors. Their shopping behaviors are drastically different today from what they used to be, with offline and online shopping interacting with each other. They have many channels to access online media, but their consumption patterns differ considerably across channels. They engage in philanthropy online to help others, but their heterogeneous motivations and the variety of fundraising campaigns lead to distinct paths to contribution. Understanding the decision-making process behind digital users’ dynamic behaviors as they interact with various digital systems is critical for firms seeking to improve user experience and their bottom line. In this thesis, I study digital users’ decision journeys and the corresponding digital technology firms’ strategies using interdisciplinary approaches that combine econometrics, economic structural modeling and machine learning. The uncovered decision journeys not only offer empirical managerial insights but also provide guidelines for introducing interventions to better serve digital users.
Understanding Music Semantics and User Behavior with Probabilistic Latent Variable Models
Bayesian probabilistic modeling provides a powerful framework for building flexible models that incorporate latent structures through the likelihood model and the prior. When we specify a model, we make certain assumptions about the underlying data-generating process with respect to these latent structures. For example, the latent Dirichlet allocation (LDA) model assumes that when generating a document, we first select a latent topic and then select a word that often appears in the selected topic. We can uncover the latent structures conditioned on the observed data via posterior inference. In this dissertation, we apply the tools of probabilistic latent variable models to understand complex real-world data about music semantics and user behavior.
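The LDA generative story described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names, the toy topic-word table, and the Dirichlet parameter are hypothetical choices made here, and the sketch covers only the forward (generative) direction, not posterior inference.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, topic_word, alpha, rng):
    """Generate one document: pick a topic per word, then a word from that topic.

    topic_word[k][w] is the probability of word w under topic k.
    """
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    topics, words = [], []
    for _ in range(n_words):
        k = rng.choices(range(len(theta)), weights=theta)[0]
        w = rng.choices(range(len(topic_word[k])), weights=topic_word[k])[0]
        topics.append(k)
        words.append(w)
    return theta, topics, words

rng = random.Random(0)
# Toy example (assumed values): 2 topics over a 4-word vocabulary.
topic_word = [[0.7, 0.2, 0.05, 0.05],
              [0.05, 0.05, 0.2, 0.7]]
theta, topics, words = generate_document(10, topic_word, [0.5, 0.5], rng)
```

Posterior inference runs this story in reverse: given only `words`, it estimates the hidden `theta` and `topics`.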
We first look into the problem of automatic music tagging -- inferring the semantic tags (e.g., "jazz", "piano", "happy", etc.) from the audio features. We treat music tagging as a matrix completion problem and apply the Poisson matrix factorization model jointly on the vector-quantized audio features and a "bag-of-tags" representation. This approach exploits the shared latent structure between semantic tags and acoustic codewords. We present experimental results on the Million Song Dataset for both annotation and retrieval tasks, illustrating the steady improvement in performance as more data is used.
We then move to the intersection between music semantics and user behavior: music recommendation. The leading performance in music recommendation is achieved by collaborative filtering methods, which exploit the similarity patterns in users' listening histories. We address the fundamental cold-start problem of collaborative filtering: it cannot recommend new songs that no one has listened to. We train a neural network on semantic tagging information as a content model and use it as a prior in a collaborative filtering model. The proposed system is evaluated on the Million Song Dataset and shows comparably better results than the collaborative filtering approaches, in addition to favorable performance in the cold-start case.
Finally, we focus on general recommender systems. We examine two different types of data, implicit and explicit feedback, and introduce the notion of user exposure (whether or not a user is exposed to an item) as part of the data-generating process; exposure is latent for implicit data and observed for explicit data. For implicit data, we propose a probabilistic matrix factorization model and infer the user exposure from data. In the language of causal analysis (Imbens and Rubin, 2015), user exposure has a close connection to the assignment mechanism. We leverage this connection more directly for explicit data and develop a causal inference approach to recommender systems. We demonstrate that causal inference for recommender systems leads to improved generalization to new data.
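To make the role of exposure in the data-generating process concrete, the following Python sketch simulates implicit feedback in which a click can only occur after exposure. The Bernoulli exposure with a single shared probability, the logistic click probability, and all names and sizes are assumptions made for illustration; the dissertation's actual model may differ.

```python
import math
import random

def simulate_implicit_feedback(user_factors, item_factors, exposure_prob, rng):
    """Simulate exposure and click matrices for every user-item pair.

    Assumed process: exposure is Bernoulli(exposure_prob); given exposure,
    a click is Bernoulli(sigmoid(user . item)); without exposure, no click.
    """
    exposure, clicks = [], []
    for theta_u in user_factors:
        a_row, y_row = [], []
        for beta_i in item_factors:
            a = 1 if rng.random() < exposure_prob else 0
            score = sum(t * b for t, b in zip(theta_u, beta_i))
            p_click = 1.0 / (1.0 + math.exp(-score))
            y = 1 if (a == 1 and rng.random() < p_click) else 0
            a_row.append(a)
            y_row.append(y)
        exposure.append(a_row)
        clicks.append(y_row)
    return exposure, clicks

rng = random.Random(0)
# Hypothetical sizes: 4 users, 6 items, 2 latent dimensions.
user_factors = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(4)]
item_factors = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(6)]
exposure, clicks = simulate_implicit_feedback(user_factors, item_factors, 0.5, rng)
```

In implicit data only `clicks` is observed, so a zero is ambiguous: the user may not have been exposed to the item, or may have seen it and not been interested. Inferring `exposure` is what resolves that ambiguity.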
Exact posterior inference is generally intractable for latent variable models. Throughout this thesis, we design specific inference procedures to tractably analyze the large-scale data encountered under each scenario.
- …