Product-based Neural Networks for User Response Prediction
Predicting user responses, such as clicks and conversions, is of great
importance and has found its usage in many Web applications including
recommender systems, web search and online advertising. The data in those
applications is mostly categorical and contains multiple fields; a typical
approach is to transform it into a high-dimensional sparse binary feature
representation via one-hot encoding. Faced with this extreme sparsity,
traditional models may be limited to mining shallow patterns from the
data, i.e. low-order feature combinations. Deep models like deep neural
networks, on the other hand, cannot be directly applied to the
high-dimensional input because of the huge feature space. In this paper, we
propose Product-based Neural Networks (PNNs) with an embedding layer to learn
a distributed representation of the categorical data, a product layer to
capture interactive patterns between inter-field categories, and further fully
connected layers to explore high-order feature interactions. Our experimental
results on two large-scale real-world ad click datasets demonstrate that PNNs
consistently outperform the state-of-the-art models on various metrics.
Comment: 6 pages, 5 figures, ICDM201
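The three stages named in the abstract (embedding layer, product layer, fully connected layers) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. The abstract does not say which product is used, how the product features are combined with the embeddings, or which activation follows, so the pairwise inner product, the concatenation, the single ReLU hidden layer and the sigmoid output are all assumptions made here. Every name (`product_layer`, `pnn_forward`, `W1`, `w_out`, the vocabulary sizes) is hypothetical.

```python
import math
import random
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def product_layer(field_vectors):
    """Pairwise inner products between the embeddings of different fields.

    Assumption: an inner product is used; the abstract only says "product layer".
    """
    pairs = combinations(range(len(field_vectors)), 2)
    return [dot(field_vectors[i], field_vectors[j]) for i, j in pairs]


def pnn_forward(field_indices, embeddings, W1, b1, w_out, b_out):
    """Predicted response probability for one sample.

    field_indices[k] is the active category of field k, i.e. the position of
    the 1 in that field's one-hot vector. embeddings[k] is the embedding table
    of field k (one row per category).
    """
    # Embedding layer: a row lookup is equivalent to multiplying the one-hot
    # vector of the field by its embedding matrix.
    f = [table[i] for table, i in zip(embeddings, field_indices)]

    # Product layer: interactions between categories of different fields.
    p = product_layer(f)

    # Assumption: embeddings and product features are concatenated.
    z = [x for vec in f for x in vec] + p

    # One fully connected hidden layer with ReLU (assumed), then a sigmoid.
    h = [max(0.0, dot(row, z) + b) for row, b in zip(W1, b1)]
    logit = dot(w_out, h) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Toy usage with random, untrained weights (hypothetical sizes).
rng = random.Random(0)
vocab_sizes, dim, hidden = [5, 3, 4], 4, 8
embeddings = [[[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(v)]
              for v in vocab_sizes]
n_fields = len(vocab_sizes)
z_dim = n_fields * dim + n_fields * (n_fields - 1) // 2
W1 = [[rng.gauss(0, 0.1) for _ in range(z_dim)] for _ in range(hidden)]
b1 = [0.0] * hidden
w_out = [rng.gauss(0, 0.1) for _ in range(hidden)]
prob = pnn_forward([2, 0, 3], embeddings, W1, b1, w_out, 0.0)
print(prob)
```

Because each field contributes only one embedding row per sample, the dense input to the hidden layer has size `z_dim` regardless of how large the one-hot feature space is.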
Bidirectional branch and bound for controlled variable selection. Part III: local average loss minimization
The selection of controlled variables (CVs) from available measurements through
exhaustive search is computationally prohibitive for large-scale processes. We
have recently proposed novel bidirectional branch and bound (B-3) approaches for
CV selection using the minimum singular value (MSV) rule and the local
worst-case loss criterion in the framework of self-optimizing control. However, the
MSV rule is approximate and the worst-case scenario may not occur frequently in
practice. Thus, CV selection by minimizing the local average loss can be deemed
the most reliable. In this work, the B-3 approach is extended to CV selection based
on the local average loss metric. Lower bounds on the local average loss, together
with fast pruning and branching algorithms, are derived for the efficient B-3 algorithm.
Random matrices and a binary distillation column case study are used to
demonstrate the computational efficiency of the proposed method.
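To illustrate why branch and bound can avoid exhaustive search, the sketch below runs a plain downward (one-directional) branch and bound for subset selection under the MSV rule mentioned in the abstract. It is not the B-3 algorithm and does not use the local average loss criterion, whose bounds are not given here. It relies only on the fact that removing rows from a matrix cannot increase its smallest singular value, so the value at a node is an upper bound for every subset below it. The gain matrix `G` is random and restricted to two columns so that the smallest singular value has a closed form; all names are hypothetical.

```python
import itertools
import math
import random


def min_singular_value(G, rows):
    """Smallest singular value of the two-column submatrix G[rows, :].

    Uses the closed-form smallest eigenvalue of the 2x2 Gram matrix
    [[a, b], [b, c]] built from the selected rows.
    """
    a = sum(G[i][0] ** 2 for i in rows)
    b = sum(G[i][0] * G[i][1] for i in rows)
    c = sum(G[i][1] ** 2 for i in rows)
    lam_min = (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return math.sqrt(max(lam_min, 0.0))  # guard against rounding below zero


def downward_bab(G, n):
    """Select n rows of G maximizing the smallest singular value.

    Starts from all rows and removes one row per level. Assumes len(G) >= n.
    """
    best = {"value": -1.0, "rows": None, "nodes": 0}

    def visit(rows, start):
        best["nodes"] += 1
        value = min_singular_value(G, rows)
        if len(rows) == n:
            # Leaf: a candidate subset of the target size.
            if value > best["value"]:
                best["value"], best["rows"] = value, rows
            return
        if value <= best["value"]:
            # Upper bound cannot beat the incumbent: prune this branch.
            return
        # Rows at positions below `start` are fixed; removing positions in
        # increasing order visits each subset at most once.
        for i in range(start, n + 1):
            visit(rows[:i] + rows[i + 1:], i)

    visit(tuple(range(len(G))), 0)
    return best


# Toy usage: 12 candidate measurements, 2 inputs, pick 2 measurements.
rng = random.Random(0)
G = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(12)]
result = downward_bab(G, 2)
brute = max(min_singular_value(G, rows)
            for rows in itertools.combinations(range(len(G)), 2))
print(result["rows"], result["value"], result["nodes"], brute)
```

The brute-force maximum is computed only to check the result on this small instance; the number of visited nodes is reported so the effect of pruning can be inspected.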