A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion
Users may strive to formulate an adequate textual query for their information
need. Search engines assist the users by presenting query suggestions. To
preserve the original search intent, suggestions should be context-aware and
account for the previous queries issued by the user. Achieving context
awareness is challenging due to data sparsity. We present a probabilistic
suggestion model that is able to account for sequences of previous queries of
arbitrary lengths. Our novel hierarchical recurrent encoder-decoder
architecture allows the model to be sensitive to the order of queries in the
context while avoiding data sparsity. Additionally, our model can provide
suggestions for rare, or long-tail, queries. The produced suggestions are synthetic and are
sampled one word at a time, using computationally cheap decoding techniques.
This is in contrast to current synthetic suggestion models relying upon machine
learning pipelines and hand-engineered feature sets. Results show that our
model outperforms existing context-aware approaches in a next-query prediction
setting. In addition to query suggestion, our model is general enough to be
used in a variety of other applications.
Comment: To appear in the Conference on Information and Knowledge Management
(CIKM) 201
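The hierarchical scheme this abstract describes (a query-level encoder over words, a session-level encoder over query vectors, and a decoder that emits a suggestion one word at a time) can be sketched in a few lines of numpy. This is a toy illustration, not the paper's implementation: all sizes and parameter names are assumptions, and plain tanh cells stand in for the actual recurrent units.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, D = 12, 8   # toy vocabulary and hidden size (illustrative assumptions)

E = rng.normal(scale=0.1, size=(VOCAB, D))    # word embeddings
O = rng.normal(scale=0.1, size=(VOCAB, D))    # output word projection

def rnn_params():
    return (rng.normal(scale=0.1, size=(D, D)),
            rng.normal(scale=0.1, size=(D, D)))

W_q, U_q = rnn_params()   # query-level encoder (reads words)
W_s, U_s = rnn_params()   # session-level encoder (reads query vectors)
W_d, U_d = rnn_params()   # decoder (emits the suggestion word by word)

def step(W, U, x, h):
    return np.tanh(W @ x + U @ h)

def encode_query(word_ids):
    h = np.zeros(D)
    for w in word_ids:
        h = step(W_q, U_q, E[w], h)
    return h

def encode_session(queries):
    # One session-level step per *query*: this is what makes the model
    # sensitive to the order of queries in the context.
    s = np.zeros(D)
    for q in queries:
        s = step(W_s, U_s, encode_query(q), s)
    return s

def suggest(queries, max_len=5, bos=0):
    # bos=0 is an assumed start-of-query token id.
    h, w, out = encode_session(queries), bos, []
    for _ in range(max_len):
        h = step(W_d, U_d, E[w], h)
        w = int(np.argmax(O @ h))   # cheap greedy decoding, one word at a time
        out.append(w)
    return out

suggestion = suggest([[1, 2, 3], [4, 5]])   # two context queries as word ids
```

Greedy argmax decoding stands in here for the cheap sampling the abstract mentions; the structural point is that the session encoder takes one step per query, so reordering the context changes the suggestion.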
Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation
Historical interactions are the default choice for recommender model
training, which typically exhibit high sparsity, i.e., most user-item pairs are
unobserved missing data. A standard choice is treating the missing data as
negative training samples and estimating interaction likelihood between
user-item pairs along with the observed interactions. In this way, some
potential interactions are inevitably mislabeled during training, which hurts
model fidelity and hinders the model from recalling the mislabeled items,
especially the long-tail ones. In this work, we investigate the mislabeling
issue from a new perspective of aleatoric uncertainty, which describes the
inherent randomness of missing data. This randomness motivates us to go beyond
the interaction likelihood alone and embrace aleatoric uncertainty modeling.
Towards this end, we propose a new Aleatoric Uncertainty-aware Recommendation
(AUR) framework that consists of a new uncertainty estimator along with a
normal recommender model. According to the theory of aleatoric uncertainty, we
derive a new recommendation objective to learn the estimator. As the chance of
mislabeling reflects the potential of a pair, AUR makes recommendations
according to the uncertainty, which is demonstrated to improve the
recommendation performance of less popular items without sacrificing the
overall performance. We instantiate AUR on three representative recommender
models: Matrix Factorization (MF), LightGCN, and VAE from mainstream model
architectures. Extensive results on two real-world datasets validate the
effectiveness of AUR in achieving better recommendation results, especially on
long-tail items.
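A minimal sketch of the idea, assuming a heteroscedastic-Gaussian-style objective on top of matrix factorization. The factorized uncertainty head and all names are illustrative assumptions, not the paper's actual AUR estimator:

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, d = 4, 6, 3   # toy sizes (assumptions)

# Toy MF embeddings plus a second set of factors acting as the
# uncertainty estimator, producing a per-pair log-variance.
P,  Q  = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))
Pu, Qu = rng.normal(size=(n_users, d)), rng.normal(size=(n_items, d))

def predict(u, i):
    mu = P[u] @ Q[i]          # interaction likelihood (recommender model)
    log_var = Pu[u] @ Qu[i]   # aleatoric uncertainty (estimator)
    return mu, log_var

def gaussian_nll(y, mu, log_var):
    # Heteroscedastic Gaussian negative log-likelihood: pairs with high
    # predicted uncertainty are penalized less for mismatching the 0/1
    # label, so likely-mislabeled negatives stop dragging the model down.
    return 0.5 * np.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var

# Rank items for user 0 by predicted uncertainty: high aleatoric
# uncertainty flags potential mislabeled negatives, i.e. promising
# (often long-tail) candidates.
scores = [(i, predict(0, i)[1]) for i in range(n_items)]
ranking = [i for i, _ in sorted(scores, key=lambda t: -t[1])]
```

The design choice illustrated is the abstract's key move: the recommendation score is driven by the uncertainty estimate rather than by the interaction likelihood alone.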
Improving Negative Sampling for Word Representation using Self-embedded Features
Although the word-popularity-based negative sampler has shown superb
performance in the skip-gram model, the theoretical motivation behind
oversampling popular (non-observed) words as negative samples is still not well
understood. In this paper, we start from an investigation of the gradient
vanishing issue in the skip-gram model without a proper negative sampler. By
performing an insightful analysis from the stochastic gradient descent (SGD)
learning perspective, we demonstrate that, both theoretically and intuitively,
negative samples with larger inner product scores are more informative than
those with lower scores for the SGD learner in terms of both convergence rate
and accuracy. Understanding this, we propose an alternative sampling algorithm
that dynamically selects informative negative samples during each SGD update.
More importantly, the proposed sampler accounts for multi-dimensional
self-embedded features during the sampling process, which essentially makes it
more effective than the original popularity-based (one-dimensional) sampler.
Empirical experiments further verify our observations and show that our
fine-grained samplers achieve significant improvements over existing ones
without increasing computational complexity.
Comment: Accepted in WSDM 201
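The dynamic sampler can be sketched as follows: score a random candidate pool against the current center embedding and keep the highest-inner-product candidates as negatives, since those produce the largest gradients for the SGD update. A toy numpy sketch; names, sizes, and the pool-then-top-k scheme are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
V, d = 100, 16                              # toy vocabulary and dimension
IN  = rng.normal(scale=0.1, size=(V, d))    # input (center) word vectors
OUT = rng.normal(scale=0.1, size=(V, d))    # output (context) word vectors

def dynamic_negatives(center, positive, n_neg=5, pool=20):
    # Draw a random candidate pool, score it by inner product with the
    # *current* center embedding (multi-dimensional, self-embedded
    # features, unlike a fixed one-dimensional popularity table), and
    # keep the top-scoring candidates: these are the most informative
    # negatives for this particular SGD update.
    cand = rng.choice(V, size=pool, replace=False)
    cand = cand[cand != positive]           # never sample the observed word
    scores = OUT[cand] @ IN[center]
    return cand[np.argsort(scores)[::-1][:n_neg]]

negs = dynamic_negatives(center=0, positive=1)
```

Because the embeddings move during training, the same center word yields different negatives at different updates, which is the "dynamic" part of the scheme.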
The Dynamics of Viral Marketing
We present an analysis of a person-to-person recommendation network,
consisting of 4 million people who made 16 million recommendations on half a
million products. We observe the propagation of recommendations and the cascade
sizes, which we explain by a simple stochastic model. We analyze how user
behavior varies within user communities defined by a recommendation network.
Product purchases follow a 'long tail' where a significant share of purchases
belongs to rarely sold items. We establish how the recommendation network grows
over time and how effective it is from the viewpoint of the sender and receiver
of the recommendations. While on average recommendations are not very effective
at inducing purchases and do not spread very far, we present a model that
successfully identifies communities, products, and pricing categories for which
viral marketing seems to be very effective.
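The "simple stochastic model" for cascades can be illustrated with a toy branching process (all parameters are illustrative assumptions, not fitted values from the paper): each recommendation converts with some small probability, and each new buyer forwards the recommendation onward. When the expected number of conversions per buyer is below one, cascades die out quickly, consistent with the observation that recommendations do not spread very far:

```python
import random

random.seed(0)

def cascade_size(p_buy=0.05, fanout=3, max_nodes=10_000):
    # Branching-process sketch: a seed buyer sends `fanout` recommendations;
    # each converts with probability p_buy, and every new buyer sends
    # `fanout` more. Subcritical when p_buy * fanout < 1.
    frontier, size = fanout, 1
    while frontier and size < max_nodes:
        buyers = sum(random.random() < p_buy for _ in range(frontier))
        size += buyers
        frontier = buyers * fanout
    return size

sizes = [cascade_size() for _ in range(1000)]
# With p_buy * fanout = 0.15, most cascades consist of the seed buyer
# alone and the average cascade stays very small.
```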
ICMRec: Item Cluster-Wise Multi-Objective Optimization for Unbiased Recommendation
The traditional observed data used to train the recommender model suffers
from severe bias issues (e.g., exposure bias, popularity bias): interactions
with a small fraction of head items account for almost all of the training data. The
normal training paradigm from such biased data tends to repetitively generate
recommendations from the head items, which further exacerbates the biases and
affects the exploration of potentially interesting items from the niche set. In
this work, distinct from existing methods, we innovatively explore the central
theme of unbiased recommendation from an item cluster-wise multi-objective
optimization perspective. Aiming to balance the learning on various item
clusters that differ in popularity during the training process, we characterize
the recommendation task as an item cluster-wise multi-objective optimization
problem. To this end, we propose a model-agnostic framework, namely Item
Cluster-Wise Multi-Objective Recommendation (ICMRec), for unbiased
recommendation. In detail, we define an item cluster-wise optimization target:
the recommender model should balance all item clusters that differ in
popularity. Thus, we treat the model's learning on each item cluster as a unique
optimization objective. To achieve this goal, we first explore items'
popularity levels from a novel causal reasoning perspective. Then, we devise
popularity discrepancy-based bisecting clustering to separate the discriminated
item clusters. Next, we adaptively find the overall harmonious gradient
direction for multiple item cluster-wise optimization objectives from a
Pareto-efficient solver. Finally, in the prediction stage, we perform
counterfactual inference to further eliminate the impact of user conformity.
Extensive experimental results demonstrate the superiority of ICMRec in
overall recommendation performance and bias elimination. Code will be
open-sourced upon acceptance.
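For the two-cluster case, a harmonious combined gradient can be sketched with the closed-form min-norm solution used by MGDA-style multi-objective solvers. This is an assumed stand-in for the paper's Pareto-efficient solver, shown only to make the "balance head and tail clusters" idea concrete:

```python
import numpy as np

def pareto_combine(g1, g2):
    # Min-norm point in the convex hull of two gradients: find alpha in
    # [0, 1] minimizing ||alpha * g1 + (1 - alpha) * g2||. The resulting
    # direction does not increase either objective, so neither the head
    # cluster nor the tail cluster dominates the update.
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return g1
    alpha = float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return alpha * g1 + (1 - alpha) * g2

# Head-cluster and tail-cluster gradients pulling in different directions:
g_head = np.array([1.0, 0.0])
g_tail = np.array([0.0, 1.0])
g = pareto_combine(g_head, g_tail)   # balanced update direction
```

Here the combined direction has a positive inner product with both cluster gradients, i.e. one step improves the learning objective of both the popular and the niche cluster.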