Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
Deep learning based methods have been widely used in industrial
recommendation systems (RSs). Previous works adopt an Embedding&MLP paradigm:
raw features are embedded into low-dimensional vectors, which are then fed into
an MLP for final recommendations. However, most of these works just concatenate
different features, ignoring the sequential nature of users' behaviors. In this
paper, we propose to use the powerful Transformer model to capture the
sequential signals underlying users' behavior sequences for recommendation in
Alibaba. Experimental results demonstrate the superiority of the proposed
model, which is then deployed online at Taobao and obtains significant
improvements in online Click-Through-Rate (CTR) compared to two baselines.
Comment: 4 pages, 1 figur
Globally Optimized Mutual Influence Aware Ranking in E-Commerce Search
In web search, mutual influences between documents have been studied from the
perspective of search result diversification. But the methods in web search are
not directly applicable to e-commerce search because of their differences. And
little research has been done on the mutual influences between items in
e-commerce search. We propose a global optimization framework for mutual
influence aware ranking in e-commerce search. Our framework directly optimizes
the Gross Merchandise Volume (GMV) for ranking, and decomposes ranking into two
tasks. The first task is mutual influence aware purchase probability
estimation. We propose a global feature extension method to incorporate mutual
influences into the features of an item. We also use a Recurrent Neural Network
(RNN) to capture influences related to ranking orders in purchase probability
estimation. The second task is to find the best ranking order based on the
purchase probability estimations. We treat the second task as a sequence
generation problem and solve it using the beam search algorithm. We performed
an online A/B test on a large e-commerce search engine. The results show that our
method brings a 5% increase in GMV for the search engine over a strong
baseline.
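The second task described above (searching for a ranking order with beam search) can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_prob` is a hypothetical stand-in for the learned, mutual-influence-aware purchase probability estimator, and the item names, prices, and beam width are assumptions chosen only for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

Ranking = Tuple[str, ...]


def expected_gmv(ranking: Sequence[str], prices: Dict[str, float],
                 purchase_prob: Callable[[Ranking, str], float]) -> float:
    """Sum of price * purchase probability, where each probability
    depends on the items ranked before the current one."""
    total = 0.0
    for pos, item in enumerate(ranking):
        total += prices[item] * purchase_prob(tuple(ranking[:pos]), item)
    return total


def beam_search_ranking(items: Sequence[str], prices: Dict[str, float],
                        purchase_prob: Callable[[Ranking, str], float],
                        beam_width: int = 2) -> Tuple[Ranking, float]:
    """Build a ranking position by position, keeping only the
    `beam_width` partial rankings with the highest expected GMV."""
    beams = [((), 0.0)]  # (partial ranking, accumulated expected GMV)
    for _ in range(len(items)):
        candidates = []
        for prefix, gmv in beams:
            for item in items:
                if item in prefix:
                    continue  # each item appears at most once
                gain = prices[item] * purchase_prob(prefix, item)
                candidates.append((prefix + (item,), gmv + gain))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


def toy_prob(prefix: Ranking, item: str) -> float:
    """Hypothetical estimator: a base probability, discounted by position,
    with item "b" hurt when the cheaper item "a" is shown before it."""
    base = {"a": 0.30, "b": 0.20, "c": 0.10}[item]
    position_discount = 0.5 ** len(prefix)
    penalty = 0.5 if (item == "b" and "a" in prefix) else 1.0
    return base * position_discount * penalty


items = ["a", "b", "c"]
prices = {"a": 10.0, "b": 30.0, "c": 20.0}
ranking, gmv = beam_search_ranking(items, prices, toy_prob, beam_width=2)
print(ranking, round(gmv, 3))
```

A wider beam explores more orders at higher cost; once the beam is wide enough to hold every partial ranking, the search becomes exhaustive.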
Multi-Source Pointer Network for Product Title Summarization
In this paper, we study the product title summarization problem in E-commerce
applications for display on mobile devices. Compared with conventional
sentence summarization, product title summarization has some extra and
essential constraints. For example, factual errors or loss of the key
information are intolerable for E-commerce applications. Therefore, we abstract
two more constraints for product title summarization: (i) do not introduce
irrelevant information; (ii) retain the key information (e.g., brand name and
commodity name). To address these issues, we propose a novel multi-source
pointer network by adding a new knowledge encoder to the pointer network. The
first constraint is handled by the pointer mechanism. For the second constraint, we
restore the key information by copying words from the knowledge encoder with
the help of a soft gating mechanism. For evaluation, we build a large
collection of real-world product titles along with human-written short titles.
Experimental results demonstrate that our model significantly outperforms the
other baselines. Finally, online deployment of our proposed model has yielded a
significant business impact, as measured by the click-through rate.
Comment: 10 pages, To appear in CIKM 2018, fix mistakes in dataset stat
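The two mechanisms named in this abstract (copying only from the sources, and a soft gate between the title and the knowledge encoder) can be sketched as follows. The abstract does not give the exact formula, so the convex combination below, the function names, the attention scores, and the gate value are all assumptions for illustration, not the paper's model.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_copy_distributions(title_tokens: Sequence[str],
                           title_scores: Sequence[float],
                           knowledge_tokens: Sequence[str],
                           knowledge_scores: Sequence[float],
                           gate: float) -> Dict[str, float]:
    """Assumed form: P(w) = gate * P_title(w) + (1 - gate) * P_knowledge(w).

    Only tokens that occur in one of the two sources can receive
    probability, so the output cannot introduce unrelated words.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    probs: Dict[str, float] = {}
    for tok, p in zip(title_tokens, softmax(title_scores)):
        probs[tok] = probs.get(tok, 0.0) + gate * p
    for tok, p in zip(knowledge_tokens, softmax(knowledge_scores)):
        probs[tok] = probs.get(tok, 0.0) + (1.0 - gate) * p
    return probs


# Hypothetical attention scores over a product title and its key
# information (brand name and commodity name).
dist = mix_copy_distributions(
    title_tokens=["nike", "running", "shoes", "sale"],
    title_scores=[2.0, 0.5, 1.5, -1.0],
    knowledge_tokens=["nike", "shoes"],
    knowledge_scores=[1.0, 1.0],
    gate=0.6,
)
print(max(dist, key=dist.get))
```

Lowering `gate` shifts probability mass toward the knowledge tokens, which is one way a model could favor retaining the brand and commodity names.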
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
Modeling users' dynamic and evolving preferences from their historical
behaviors is challenging and crucial for recommendation systems. Previous
methods employ sequential neural networks (e.g., Recurrent Neural Network) to
encode users' historical interactions from left to right into hidden
representations for making recommendations. Although these methods achieve
satisfactory results, they often assume a rigidly ordered sequence which is not
always practical. We argue that such left-to-right unidirectional architectures
restrict the power of the historical sequence representations. For this
purpose, we introduce a Bidirectional Encoder Representations from Transformers
for sequential Recommendation (BERT4Rec). However, jointly conditioning on both
left and right context in a deep bidirectional model would make training
trivial, since each item can indirectly "see the target item". To address
this problem, we train the bidirectional model using the Cloze task, predicting
the masked items in the sequence by jointly conditioning on their left and
right context. Compared with predicting the next item at each position in a
sequence, the Cloze task can produce more samples to train a more powerful
bidirectional model. Extensive experiments on four benchmark datasets show that
our model outperforms various state-of-the-art sequential models consistently.
Comment: To appear in CIKM 201
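The Cloze task described above can be sketched as a data-preparation step: some items in a user's history are replaced by a mask token, and the model is trained to recover them from both sides of the context. The snippet below is a minimal illustration, not the BERT4Rec code; the `MASK` token, the masking probability, and the fallback that forces at least one masked position are assumptions.

```python
import random
from typing import Dict, List, Sequence, Tuple

MASK = "[MASK]"  # hypothetical placeholder token


def cloze_mask(sequence: Sequence[str], mask_prob: float,
               rng: random.Random) -> Tuple[List[str], Dict[int, str]]:
    """Return (masked sequence, {position: original item}).

    Each position is masked independently with probability `mask_prob`.
    The labels are the training targets for the masked positions.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    masked = list(sequence)
    labels: Dict[int, str] = {}
    for pos, item in enumerate(sequence):
        if rng.random() < mask_prob:
            masked[pos] = MASK
            labels[pos] = item
    if not labels:
        # Assumption: guarantee at least one prediction target per sample.
        pos = rng.randrange(len(sequence))
        masked[pos] = MASK
        labels[pos] = sequence[pos]
    return masked, labels


history = ["i1", "i2", "i3", "i4", "i5", "i6"]
masked, labels = cloze_mask(history, mask_prob=0.3, rng=random.Random(0))
print(masked, labels)
```

Because a new random mask can be drawn on every pass over the same history, one sequence yields many distinct training samples, which is the advantage over next-item prediction that the abstract points to.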
Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning
Ranking is a fundamental and widely studied problem in scenarios such as
search, advertising, and recommendation. However, joint optimization for
multi-scenario ranking, which aims to improve the overall performance of
several ranking strategies in different scenarios, remains largely unexplored.
Separately optimizing each individual strategy has two limitations. The first
one is the lack of collaboration between scenarios, meaning that each strategy
maximizes its own objective but ignores the goals of other strategies, leading
to a sub-optimal overall performance. The second limitation is the inability to
model the correlation between scenarios, meaning that independent
optimization in one scenario only uses its own user data but ignores the
context in other scenarios.
In this paper, we formulate multi-scenario ranking as a fully cooperative,
partially observable, multi-agent sequential decision problem. We propose a
novel model named Multi-Agent Recurrent Deterministic Policy Gradient (MA-RDPG)
which has a communication component for passing messages, several private
actors (agents) for taking ranking actions, and a centralized critic for
evaluating the overall performance of the co-working actors. Each scenario is
treated as an agent (actor). Agents collaborate with each other by sharing a
global action-value function (the critic) and passing messages that encode
historical information across scenarios. The model is evaluated in online
settings on a large E-commerce platform. Results show that the proposed model
exhibits significant improvements against baselines in terms of the overall
performance.
Comment: WWW201
Exact-K Recommendation via Maximal Clique Optimization
This paper targets a novel but practical recommendation problem named
exact-K recommendation. It is different from traditional top-K recommendation,
as it focuses more on (constrained) combinatorial optimization, which
optimizes the recommendation of a whole set of K items called a card, rather than ranking
optimization, which assumes that "better" items should be put into top
positions. Thus we take the first step to give a formal problem definition, and
innovatively reduce it to Maximum Clique Optimization on a graph. To tackle
this specific combinatorial optimization problem which is NP-hard, we propose
Graph Attention Networks (GAttN) with a Multi-head Self-attention encoder and a
decoder with attention mechanism. It can end-to-end learn the joint
distribution of the K items and generate an optimal card rather than rank
individual items by prediction scores. Then we propose Reinforcement Learning
from Demonstrations (RLfD), which combines the advantages of behavior cloning
and reinforcement learning, making training both sufficient and efficient.
Extensive experiments on three datasets demonstrate the effectiveness of
our proposed GAttN with RLfD method, which outperforms several strong baselines
with a relative improvement of 7.7% and 4.7% on average in Precision and Hit
Ratio respectively, and achieves state-of-the-art (SOTA) performance for the
exact-K recommendation problem.
Comment: SIGKDD 201
Perceive Your Users in Depth: Learning Universal User Representations from Multiple E-commerce Tasks
Tasks such as search and recommendation have become increasingly important
for E-commerce to deal with the information overload problem. To meet the
diverse needs of different users, personalization plays an important role. In
many large portals such as Taobao and Amazon, there are a bunch of different
types of search and recommendation tasks operating simultaneously for
personalization. However, most current techniques address each task separately.
This is suboptimal, as no information about users is shared across different tasks.
In this work, we propose to learn universal user representations across
multiple tasks for more effective personalization. In particular, user
behavior sequences (e.g., click, bookmark or purchase of products) are modeled
by LSTM and attention mechanism by integrating all the corresponding content,
behavior and temporal information. User representations are shared and learned
in an end-to-end setting across multiple tasks. Benefiting from better
information utilization of multiple tasks, the user representations are more
effective at reflecting their interests and are more general to be transferred to new
tasks. We refer to this work as Deep User Perception Network (DUPN) and conduct an
extensive set of offline and online experiments. Across all five tested
tasks, our DUPN consistently achieves better results by giving more effective
user representations. Moreover, we deploy DUPN in large scale operational tasks
in Taobao. Detailed implementations, e.g., incremental model updating, are
also provided to address the practical issues for the real world applications.
Comment: 10 pages, accepted as an oral paper in sigKDD2018(industry track
Revisit Recommender System in the Permutation Prospective
Recommender systems (RS) are effective at alleviating information overload
and matching user interests in various web-scale applications. Most RS retrieve
the user's favorite candidates and then rank them by their rating scores in a
greedy manner. From the permutation perspective, however, current RS
reveal the following two limitations: 1) They neglect the
permutation-variant influence within the recommended results; 2) Permutation
consideration extends the latent solution space exponentially, and current RS
lack the ability to evaluate the permutations. Both drive RS away from the
permutation-optimal recommended results and better user experience. To
approximate the permutation-optimal recommended results effectively and
efficiently, we propose a novel permutation-wise framework PRS in the
re-ranking stage of RS, which consists of Permutation-Matching (PMatch) and
Permutation-Ranking (PRank) stages successively. Specifically, the PMatch stage
is designed to obtain the candidate list set, where we propose the FPSA
algorithm to generate multiple candidate lists via the permutation-wise and
goal-oriented beam search algorithm. Afterwards, for the candidate list set,
the PRank stage provides a unified permutation-wise ranking criterion named the LR
metric, which is calculated from the rating scores of an elaborately designed
permutation-wise model, DPWN. Finally, the list with the highest LR score is
recommended to the user. Empirical results show that PRS consistently and
significantly outperforms state-of-the-art methods. Moreover, PRS has achieved
a performance improvement of 11.0% on the PV metric and 8.7% on the IPV metric after
successful deployment in one popular recommendation scenario of the Taobao
application.
Comment: Under the review of the KDD2021 Applied Data Science trac
Semi-supervised Collaborative Filtering by Text-enhanced Domain Adaptation
Data sparsity is an inherent challenge in recommender systems, where most
of the data is collected from the implicit feedback of users. This causes two
difficulties in designing effective algorithms: first, the majority of users
only have a few interactions with the system and there is not enough data for
learning; second, there are no negative samples in the implicit feedback and
it is a common practice to perform negative sampling to generate negative
samples. However, this leads to a consequence that many potential positive
samples are mislabeled as negative ones and data sparsity would exacerbate the
mislabeling problem. To solve these difficulties, we regard the problem of
recommendation on sparse implicit feedback as a semi-supervised learning task,
and explore domain adaptation to solve it. We transfer the knowledge learned from
dense data to sparse data and we focus on the most challenging case -- there is
no user or item overlap. In this extreme case, aligning embeddings of two
datasets directly is rather sub-optimal since the two latent spaces encode very
different information. As such, we adopt domain-invariant textual features as
the anchor points to align the latent spaces. To align the embeddings, we
extract the textual features for each user and item and feed them into a domain
classifier with the embeddings of users and items. The embeddings are trained
to confuse the classifier and textual features are fixed as anchor points. By
domain adaptation, the distribution pattern in the source domain is transferred
to the target domain. As the target part can be supervised by domain
adaptation, we abandon negative sampling in target dataset to avoid label
noise. We adopt three pairs of real-world datasets to validate the
effectiveness of our transfer strategy. Results show that our models outperform
existing models significantly.
Comment: KDD 2020 pape
Personalized Re-ranking for Recommendation
Ranking is a core task in recommender systems, which aims at providing an
ordered list of items to users. Typically, a ranking function is learned from
the labeled dataset to optimize the global performance, which produces a
ranking score for each individual item. However, it may be sub-optimal because
the scoring function applies to each item individually and does not explicitly
consider the mutual influence between items, as well as the differences of
users' preferences or intents. Therefore, we propose a personalized re-ranking
model for recommender systems. The proposed re-ranking model can be easily
deployed as a follow-up module after any ranking algorithm, by directly using
the existing ranking feature vectors. It directly optimizes the whole
recommendation list by employing a Transformer structure to efficiently encode
the information of all items in the list. Specifically, the Transformer applies
a self-attention mechanism that directly models the global relationships
between any pair of items in the whole list. We confirm that the performance
can be further improved by introducing pre-trained embeddings to learn
personalized encoding functions for different users. Experimental results on
both offline benchmarks and real-world online e-commerce systems demonstrate
the significant improvements of the proposed re-ranking model.
Comment: 9 page
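The self-attention step mentioned in this abstract, in which every item in the list attends to every other item, can be sketched in a few lines. This is a simplified illustration rather than the paper's model: it uses a single head with identity query, key, and value projections (an assumption made to keep the example short), and the feature vectors are made up.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(item_vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scaled dot-product self-attention over a ranked list.

    Each output vector is a weighted average of all item vectors, so the
    encoding of one item depends on every other item in the list.
    """
    dim = len(item_vectors[0])
    outputs = []
    for query in item_vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in item_vectors]
        weights = softmax(scores)
        outputs.append([sum(w * vec[j] for w, vec in zip(weights, item_vectors))
                        for j in range(dim)])
    return outputs


# Hypothetical ranking feature vectors for a list of three items.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded = self_attention(features)
for row in encoded:
    print([round(x, 3) for x in row])
```

Without position information this operation treats the list as a set: reordering the inputs only reorders the outputs. A re-ranker that should be sensitive to the initial order would need to add position information to the inputs, which the sketch omits.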