A Recurrent Neural Network Survival Model: Predicting Web User Return Time
The size of a website's active user base directly affects its value. Thus, it
is important to monitor and influence a user's likelihood to return to a site.
Essential to this is predicting when a user will return. Current
state-of-the-art approaches to this problem come in two flavors: (1) Recurrent Neural
Network (RNN) based solutions and (2) survival analysis methods. We observe
that both techniques are severely limited when applied to this problem.
Survival models can only incorporate aggregate representations of users instead
of automatically learning a representation directly from a raw time series of
user actions. RNNs can automatically learn features, but cannot be directly
trained with examples of non-returning users who have no target value for their
return time. We develop a novel RNN survival model that removes the limitations
of the state-of-the-art methods. We demonstrate that this model can
successfully be applied to return time prediction on a large e-commerce dataset
with a superior ability to discriminate between returning and non-returning
users than either method applied in isolation. Comment: Accepted into ECML PKDD 2018; 8 figures and 1 table
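The limitation described above, that non-returning users have no target return time, is what survival analysis calls right-censoring. The standard remedy is a likelihood in which a returning user contributes the log-density of the observed return time and a non-returning user contributes the log-probability of not having returned by the end of observation. The sketch below illustrates this with a constant-rate exponential model; it is not the paper's RNN model, and the function name and arguments are hypothetical.

```python
import math
from typing import Sequence


def censored_exponential_nll(
    rate: float,
    durations: Sequence[float],
    returned: Sequence[bool],
) -> float:
    """Negative log-likelihood of an exponential return-time model.

    Assumption (illustrative only): return times are exponential with a
    single constant rate. durations[i] is the observed return time if
    returned[i] is True, otherwise the length of the observation window.
    """
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    log_likelihood = 0.0
    for t, observed in zip(durations, returned):
        if observed:
            # Returning user: log density, log f(t) = log(rate) - rate * t
            log_likelihood += math.log(rate) - rate * t
        else:
            # Non-returning user: log survival, log S(t) = -rate * t
            log_likelihood += -rate * t
    return -log_likelihood


# One user returned after 2 time units, one had not returned after 3.
nll = censored_exponential_nll(1.0, [2.0, 3.0], [True, False])
```

In a neural variant, the constant rate would presumably be replaced by a quantity predicted from the user's action sequence, while the two-branch structure of the loss stays the same.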
Deep Landscape Forecasting for Real-time Bidding Advertising
The emergence of real-time auctions in online advertising has drawn considerable
attention to modeling market competition, i.e., bid landscape forecasting.
The problem is formulated as forecasting the probability distribution of the market
price for each ad auction. To handle the censorship issue
caused by the second-price auction mechanism, many researchers have
approached bid landscape forecasting by incorporating survival analysis
from the medical research field. However, most existing solutions focus on
either counting-based statistics of segmented sample clusters or learning
a parameterized model based on heuristic assumptions about the distribution
form. Moreover, they do not consider the sequential patterns of the feature
over the price space. In order to capture more sophisticated yet flexible
patterns at a fine-grained level of the data, we propose a Deep Landscape
Forecasting (DLF) model which combines deep learning for probability
distribution forecasting and survival analysis for censorship handling.
Specifically, we utilize a recurrent neural network to flexibly model the
conditional winning probability w.r.t. each bid price. We then perform bid
landscape forecasting through the probability chain rule with rigorous mathematical
derivations. We optimize the model end to end by minimizing
two well-motivated negative likelihood losses. Without any
specific assumption about the distributional form of the bid landscape, our model shows
great advantages over previous work in fitting various sophisticated market
price distributions. In experiments on two large-scale real-world
datasets, our model significantly outperforms the state-of-the-art solutions
under various metrics. Comment: KDD 2019. The reproducible code and dataset link is
https://github.com/rk2900/DL
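The chain-rule step mentioned in the abstract can be illustrated with a small discrete-time sketch. It assumes the price range is split into ordered buckets and that a model (an RNN in the paper, a plain list here) supplies for each bucket the conditional probability that the market price falls in that bucket given that it exceeds all earlier ones. The function names are hypothetical and this is not the authors' implementation.

```python
from typing import List


def survival_from_hazards(hazards: List[float]) -> List[float]:
    """S[l] = prod_{k <= l} (1 - h[k]): probability that the market price
    lies above bucket l, i.e. that a bid at bucket l loses."""
    survival = []
    running = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each conditional probability must be in [0, 1]")
        running *= 1.0 - h
        survival.append(running)
    return survival


def price_pmf_from_hazards(hazards: List[float]) -> List[float]:
    """p[l] = h[l] * prod_{k < l} (1 - h[k]): probability that the market
    price falls exactly in bucket l (the forecast bid landscape)."""
    pmf = []
    still_above = 1.0  # probability the price exceeds all earlier buckets
    for h in hazards:
        pmf.append(still_above * h)
        still_above *= 1.0 - h
    return pmf


hazards = [0.5, 0.5, 0.2]  # made-up conditional probabilities
survival = survival_from_hazards(hazards)
pmf = price_pmf_from_hazards(hazards)
win_prob = [1.0 - s for s in survival]  # chance a bid at each bucket wins
```

Because the distribution is built from per-bucket conditional probabilities, no parametric form is imposed on it; the bucket masses plus the final survival value always sum to one.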
Learning Rich Geographical Representations: Predicting Colorectal Cancer Survival in the State of Iowa
Neural networks are capable of learning rich, nonlinear feature
representations shown to be beneficial in many predictive tasks. In this work,
we use these models to explore the use of geographical features in predicting
colorectal cancer survival curves for patients in the state of Iowa, spanning
the years 1989 to 2012. Specifically, we compare model performance using a
newly defined metric -- area between the curves (ABC) -- to assess (a) whether
survival curves can be reasonably predicted for colorectal cancer patients in
the state of Iowa, (b) whether geographical features improve predictive
performance, and (c) whether a simple binary representation or a richer, spectral
clustering-based representation performs better. Our findings suggest that
survival curves can be reasonably estimated on average, with predictive
performance deviating at the five-year survival mark. We also find that
geographical features improve predictive performance, and that the best
performance is obtained using richer, spectral analysis-elicited features. Comment: 8 pages
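The abstract names the area between the curves (ABC) metric without defining it. One plausible reading, stated here as an assumption rather than the paper's definition, is the integrated absolute gap between a predicted and a reference survival curve sampled on a shared time grid. The sketch below approximates that integral with the trapezoidal rule; the function name is hypothetical.

```python
from typing import Sequence


def area_between_curves(
    times: Sequence[float],
    curve_a: Sequence[float],
    curve_b: Sequence[float],
) -> float:
    """Trapezoidal estimate of the integral of |curve_a - curve_b| over time.

    Assumed reading of "area between the curves"; the estimate is only
    approximate on intervals where the two curves cross.
    """
    if not (len(times) == len(curve_a) == len(curve_b)):
        raise ValueError("all inputs must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        gap_left = abs(curve_a[i - 1] - curve_b[i - 1])
        gap_right = abs(curve_a[i] - curve_b[i])
        area += 0.5 * width * (gap_left + gap_right)
    return area


# Made-up survival probabilities at years 0, 1 and 2.
abc = area_between_curves([0.0, 1.0, 2.0], [1.0, 0.8, 0.6], [1.0, 0.6, 0.6])
```

Under this reading, a value of zero means the two curves coincide on the grid, and larger values mean the prediction strays further from the reference.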