A Recurrent Neural Network Survival Model: Predicting Web User Return Time
The size of a website's active user base directly affects its value. Thus, it
is important to monitor and influence a user's likelihood to return to a site.
Essential to this is predicting when a user will return. Current state of the
art approaches to solve this problem come in two flavors: (1) Recurrent Neural
Network (RNN) based solutions and (2) survival analysis methods. We observe
that both techniques are severely limited when applied to this problem.
Survival models can only incorporate aggregate representations of users instead
of automatically learning a representation directly from a raw time series of
user actions. RNNs can automatically learn features, but cannot be directly
trained with examples of non-returning users who have no target value for their
return time. We develop a novel RNN survival model that removes the limitations
of the state of the art methods. We demonstrate that this model can
successfully be applied to return time prediction on a large e-commerce dataset
with a superior ability to discriminate between returning and non-returning
users than either method applied in isolation.
Comment: Accepted into ECML PKDD 2018; 8 figures and 1 table
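The point about non-returning users can be made concrete with a small sketch of a censored likelihood. This is not the paper's RNN model: it assumes a single constant return rate (an exponential survival model), and the names `censored_nll` and `exponential_rate_mle` are hypothetical. It only illustrates how a user with no observed return time can still contribute to the training loss.

```python
import math

def censored_nll(rate, durations, returned):
    """Mean negative log-likelihood under an assumed exponential model.

    durations[i] is the return time if returned[i] is True, otherwise the
    length of time the user was observed without returning (censored).
    """
    total = 0.0
    for t, observed in zip(durations, returned):
        if observed:
            # Returning user: log density, log(rate) - rate * t.
            total -= math.log(rate) - rate * t
        else:
            # Non-returning user: log survival probability, -rate * t.
            total -= -rate * t
    return total / len(durations)

def exponential_rate_mle(durations, returned):
    """Closed-form maximizer: observed returns divided by total exposure."""
    return sum(1 for r in returned if r) / sum(durations)

# Illustrative data (made up): two users returned, two did not.
durations = [2.0, 4.0, 6.0, 8.0]
returned = [True, True, False, False]
rate = exponential_rate_mle(durations, returned)
loss = censored_nll(rate, durations, returned)
```

A plain regression loss would have to discard the last two users, whereas the survival likelihood uses their observation windows as evidence that the return rate is low.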
Biologically Interpretable, Integrative Deep Learning for Cancer Survival Analysis
Identifying complex biological processes associated with patients' survival time at the cellular and molecular level is critical not only for developing new treatments for patients but also for accurate survival prediction. However, highly nonlinear and high-dimension, low-sample size (HDLSS) data cause computational challenges in survival analysis. We developed a novel family of pathway-based, sparse deep neural networks (PASNet) for cancer survival analysis. The PASNet family is a biologically interpretable neural network model where nodes in the network correspond to specific genes and pathways, while capturing nonlinear and hierarchical effects of biological pathways associated with certain clinical outcomes. Furthermore, integration of heterogeneous types of biological data from biospecimens holds promise of improving survival prediction and personalized therapies in cancer. Specifically, the integration of genomic data and histopathological images enhances survival predictions and personalized treatments in cancer studies, while providing an in-depth understanding of genetic mechanisms and phenotypic patterns of cancer. Two proposed models will be introduced for integrating multi-omics data and pathological images, respectively. Each model in the PASNet family was evaluated by comparing its performance with that of current cutting-edge models on The Cancer Genome Atlas (TCGA) cancer data. In the extensive experiments, the PASNet family outperformed the benchmarking methods, and the outstanding performance was statistically assessed. More importantly, the PASNet family showed the capability to interpret a multi-layered biological system. The biological literature on GBM supported the biological interpretation of the proposed models. The open-source software of the PASNet family in PyTorch is publicly available at https://github.com/DataX-JieHao
Interpretable Deep Neural Network for Cancer Survival Analysis by Integrating Genomic and Clinical Data
Background: Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear and high-dimension, low-sample size (HDLSS) data pose computational challenges for applying conventional survival analysis. Results: We propose a novel biologically interpretable pathway-based sparse deep neural network, named Cox-PASNet, which integrates high-dimensional gene expression data and clinical data on a simple neural network architecture for survival analysis. Cox-PASNet is biologically interpretable in that nodes in the neural network correspond to biological genes and pathways, while capturing the nonlinear and hierarchical effects of biological pathways associated with cancer patient survival. We also propose a heuristic optimization solution to train Cox-PASNet with HDLSS data. Cox-PASNet was evaluated intensively by comparing its predictive performance with that of current state-of-the-art methods on glioblastoma multiforme (GBM) and ovarian serous cystadenocarcinoma (OV) cancer. In the experiments, Cox-PASNet outperformed the benchmarking methods. Moreover, the neural network architecture of Cox-PASNet was biologically interpreted, and several significant prognostic factors of genes and biological pathways were identified. Conclusions: Cox-PASNet models biological mechanisms in the neural network by incorporating biological pathway databases and sparse coding. The neural network of Cox-PASNet can identify nonlinear and hierarchical associations of genomic and clinical data to cancer patient survival. The open-source code of Cox-PASNet in PyTorch implemented for training, evaluation, and model interpretation is available at: https://github.com/DataX-JieHao/Cox-PASNet
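The "Cox" in Cox-PASNet suggests a Cox proportional hazards objective, although the abstract does not spell out the loss. As a rough illustration under that assumption, the sketch below computes the standard negative log partial likelihood in plain Python. The function name and toy data are hypothetical, ties are not handled specially, and this is not taken from the authors' repository.

```python
import math

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Mean negative log partial likelihood over subjects with an observed event.

    risk_scores[i] is the model output for subject i (higher means higher risk),
    times[i] is the survival or censoring time, and events[i] is True when the
    event was observed and False when the subject was censored.
    """
    nll = 0.0
    n_events = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored subjects only appear inside risk sets
        # Risk set: everyone still under observation at time times[i].
        risk_set = [risk_scores[j] for j in range(len(times)) if times[j] >= times[i]]
        log_denominator = math.log(sum(math.exp(r) for r in risk_set))
        nll -= risk_scores[i] - log_denominator
        n_events += 1
    return nll / max(n_events, 1)

# Toy data (made up): the first subject fails earlier than the second.
uninformative = cox_neg_log_partial_likelihood([0.0, 0.0], [1.0, 2.0], [True, True])
well_ranked = cox_neg_log_partial_likelihood([1.0, 0.0], [1.0, 2.0], [True, True])
```

The loss decreases when subjects who fail earlier receive higher risk scores, which is the ranking behaviour a survival network of this kind is trained to produce.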
Deep Attentive Survival Analysis in Limit Order Books: Estimating Fill Probabilities with Convolutional-Transformers
One of the key decisions in execution strategies is the choice between a
passive (liquidity providing) or an aggressive (liquidity taking) order to
execute a trade in a limit order book (LOB). Essential to this choice is the
fill probability of a passive limit order placed in the LOB. This paper
proposes a deep learning method to estimate the filltimes of limit orders
posted in different levels of the LOB. We develop a novel model for survival
analysis that maps time-varying features of the LOB to the distribution of
filltimes of limit orders. Our method is based on a convolutional-Transformer
encoder and a monotonic neural network decoder. We use proper scoring rules to
compare our method with other approaches in survival analysis, and perform an
interpretability analysis to understand the informativeness of features used to
compute fill probabilities. Our method significantly outperforms those
typically used in survival analysis literature. Finally, we carry out a
statistical analysis of the fill probability of orders placed in the order book
(e.g., within the bid-ask spread) for assets with different queue dynamics and
trading activity.
Application of neural networks and sensitivity analysis to improved prediction of trauma survival
Deep Landscape Forecasting for Real-time Bidding Advertising
The emergence of real-time auctions in online advertising has drawn great
attention to modeling market competition, i.e., bid landscape forecasting.
The problem is formulated as forecasting the probability distribution of the
market price for each ad auction. To handle the censorship issue caused by the
second-price auction mechanism, many researchers have applied survival
analysis from the medical research field to bid landscape forecasting.
However, most existing solutions focus on either counting-based statistics of
the segmented sample clusters, or learning a parameterized model based on
heuristic assumptions about the form of the distribution. Moreover, they do
not consider the sequential patterns of the features over the price space. In
order to capture more sophisticated yet flexible
patterns at fine-grained level of the data, we propose a Deep Landscape
Forecasting (DLF) model which combines deep learning for probability
distribution forecasting and survival analysis for censorship handling.
Specifically, we utilize a recurrent neural network to flexibly model the
conditional winning probability w.r.t. each bid price. Then we conduct the bid
landscape forecasting through probability chain rule with strict mathematical
derivations. And, in an end-to-end manner, we optimize the model by minimizing
two negative likelihood losses with comprehensive motivations. Without any
specific assumption for the distribution form of bid landscape, our model shows
great advantages over previous works on fitting various sophisticated market
price distributions. In the experiments over two large-scale real-world
datasets, our model significantly outperforms the state-of-the-art solutions
under various metrics.
Comment: KDD 2019. The reproducible code and dataset link is
https://github.com/rk2900/DL
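The probability chain rule mentioned in the abstract can be sketched on a discretized price grid. The example below assumes that a model (an RNN in the paper, plain numbers here) has already produced, for each price bucket, the conditional probability that the market price falls in that bucket given that it has not fallen in a lower one. The function name `landscape_from_hazards` and the input values are hypothetical, and this is not the authors' implementation.

```python
def landscape_from_hazards(hazards):
    """Turn per-bucket conditional probabilities into a bid landscape.

    hazards[l] is the assumed P(market price in bucket l | price not in any
    lower bucket). Returns (pmf, win_prob), where pmf[l] is the unconditional
    probability of bucket l and win_prob[l] is the probability that the
    market price lies in bucket l or below.
    """
    pmf, win_prob = [], []
    survival = 1.0  # probability that the market price exceeds all buckets so far
    for h in hazards:
        pmf.append(survival * h)         # chain rule: survive so far, then land here
        survival *= 1.0 - h
        win_prob.append(1.0 - survival)  # cumulative distribution up to this bucket
    return pmf, win_prob

# Illustrative conditional probabilities for three price buckets (made up).
pmf, win_prob = landscape_from_hazards([0.5, 0.5, 1.0])
```

Because each bucket's probability is a product of conditional terms, no parametric form is imposed on the resulting distribution, which matches the abstract's claim of making no specific assumption about the distribution form.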