1,319 research outputs found
Multi-Task Recommendations with Reinforcement Learning
In recent years, Multi-task Learning (MTL) has yielded immense success in
Recommender System (RS) applications. However, current MTL-based recommendation
models tend to disregard the session-wise patterns of user-item interactions
because they are predominantly constructed based on item-wise datasets.
Moreover, balancing multiple objectives remains a persistent challenge in this
field, one that existing works typically sidestep through linear estimations. To
address these issues, in this paper, we propose a Reinforcement Learning (RL)
enhanced MTL framework, namely RMTL, to combine the losses of different
recommendation tasks using dynamic weights. Specifically, RMTL addresses
the two aforementioned issues by (i) constructing an MTL
environment from session-wise interactions, (ii) training a multi-task
actor-critic network structure that is compatible with most existing
MTL-based recommendation models, and (iii) optimizing and fine-tuning the MTL
loss function using the weights generated by critic networks. Experiments on
two real-world public datasets demonstrate the effectiveness of RMTL, which
achieves a higher AUC than state-of-the-art MTL-based recommendation models.
Additionally, we evaluate and validate RMTL's compatibility and transferability
across various MTL models.
Comment: TheWebConf202
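The idea of combining per-task losses with dynamic weights, as described in the abstract above, can be illustrated with a minimal sketch. This is not the RMTL implementation: the two tasks (click and conversion), the function names, and the weight values are assumptions made for illustration, and the weights are simply passed in because the abstract does not say how the critic networks produce them.

```python
import math


def bce(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy for a single example."""
    # Clamp the prediction so log() never receives 0.
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def weighted_mtl_loss(labels, preds, weights):
    """Sum the per-task losses, each scaled by its own weight."""
    return sum(w * bce(y, p) for y, p, w in zip(labels, preds, weights))


# Hypothetical example: task 0 = click, task 1 = conversion.
labels = [1, 0]
preds = [0.8, 0.3]

# Fixed, equal weights (a static combination).
static = weighted_mtl_loss(labels, preds, [1.0, 1.0])

# Weights that could change from step to step; here they are hand-picked
# placeholders standing in for values supplied by some learned component.
dynamic = weighted_mtl_loss(labels, preds, [0.5, 1.5])
```

With the second weight vector the conversion loss contributes more to the total, so the same predictions yield a different training signal than under the static combination.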
Advances and Challenges of Multi-task Learning Method in Recommender System: A Survey
Multi-task learning has been widely applied in computational vision, natural
language processing, and other fields, where it has achieved good performance. In
recent years, a large body of work on multi-task learning for recommender systems
has appeared, but no prior literature summarizes these works. To
bridge this gap, we provide a systematic literature survey about multi-task
recommender systems, aiming to help researchers and practitioners quickly
understand the current progress in this direction. In this survey, we first
introduce the background and the motivation of the multi-task learning-based
recommender systems. Then we provide a taxonomy of multi-task learning-based
recommendation methods according to the different stages of multi-task learning
techniques, including task relationship discovery, model architecture, and
optimization strategy. Finally, we discuss applications and
promising future directions in this area.
Controllable Multi-Objective Re-ranking with Policy Hypernetworks
Multi-stage ranking pipelines have become widely used strategies in modern
recommender systems, where the final stage aims to return a ranked list of
items that balances a number of requirements such as user preference,
diversity, novelty etc. Linear scalarization is arguably the most widely used
technique to merge multiple requirements into one optimization objective, by
summing up the requirements with certain preference weights. Existing
final-stage ranking methods often adopt a static model where the preference
weights are determined during offline training and kept unchanged during online
serving. Whenever the preference weights need to change, the model
has to be retrained, which is costly in both time and resources. Meanwhile, the
most appropriate weights may vary greatly across different groups of target
users or across time periods (e.g., during holiday promotions). In this
paper, we propose a framework called controllable multi-objective re-ranking
(CMR) which incorporates a hypernetwork to generate parameters for a re-ranking
model according to different preference weights. In this way, CMR can
adapt the preference weights to changes in the environment in an online
manner, without retraining the models. Moreover, we classify practical
business-oriented tasks into four main categories and seamlessly incorporate
them into a newly proposed re-ranking model based on an Actor-Evaluator framework,
which serves as a reliable real-world testbed for CMR. Offline experiments
on a dataset collected from the Taobao App showed that CMR improved several
popular re-ranking models when they were used as its underlying models. Online
A/B tests also demonstrated the effectiveness and trustworthiness of CMR.
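The two mechanisms named in the abstract above, linear scalarization and a hypernetwork that generates re-ranker parameters from preference weights, can be sketched as follows. This is an illustrative toy under stated assumptions, not the CMR model: the class `TinyHypernetwork`, the linear scorer, the feature values, and the random (untrained) hypernetwork weights are all hypothetical.

```python
import numpy as np


def scalarize(objective_scores, pref_weights):
    """Linear scalarization: weighted sum of per-objective scores for each item."""
    return objective_scores @ pref_weights


class TinyHypernetwork:
    """Maps a preference-weight vector to the parameters of a linear scorer.

    The mapping here is a single random linear layer; a real hypernetwork
    would be trained, which this sketch does not attempt.
    """

    def __init__(self, n_objectives, n_features, seed=0):
        rng = np.random.default_rng(seed)
        self.H = rng.normal(size=(n_features, n_objectives))
        self.b = np.zeros(n_features)

    def generate(self, pref_weights):
        # One parameter per item feature, conditioned on the preferences.
        return self.H @ pref_weights + self.b


def rerank(item_features, scorer_params):
    """Return item indices ordered from highest to lowest score."""
    scores = item_features @ scorer_params
    return np.argsort(-scores)


# Hypothetical candidate list: 3 items, 3 features each.
items = np.array([[0.2, 0.9, 0.1],
                  [0.7, 0.1, 0.5],
                  [0.4, 0.4, 0.8]])

hyper = TinyHypernetwork(n_objectives=2, n_features=3)

# Changing the preference weights changes the scorer parameters,
# with no retraining step in between.
params_a = hyper.generate(np.array([0.9, 0.1]))
params_b = hyper.generate(np.array([0.1, 0.9]))
order_a = rerank(items, params_a)
order_b = rerank(items, params_b)
```

The point of the design is that the preference weights become an input at serving time rather than a constant baked in during offline training.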
Tree-based Text-Vision BERT for Video Search in Baidu Video Advertising
The advancement of communication technology and the popularity of
smartphones have fueled a boom in video ads. Baidu, as one of the leading
search engine companies in the world, receives billions of search queries per
day. Pairing video ads with user search queries is the core task of Baidu
video advertising. Due to the modality gap, the query-to-video retrieval is
much more challenging than traditional query-to-document retrieval and
image-to-image search. Traditionally, query-to-video retrieval is tackled
through query-to-title retrieval, which is unreliable when the quality of
titles is low. With the rapid progress achieved in computer vision and
natural language processing in recent years, content-based search methods
have become promising for query-to-video retrieval. Benefiting from pretraining
on large-scale datasets, some visionBERT methods based on cross-modal attention
have achieved excellent performance in many vision-language tasks not only in
academia but also in industry. Nevertheless, the expensive computation cost of
cross-modal attention makes it impractical for large-scale search in industrial
applications. In this work, we present a tree-based combo-attention network
(TCAN) which has been recently launched in Baidu's dynamic video advertising
platform. It provides a practical solution to deploy the heavy cross-modal
attention for the large-scale query-to-video search. After launching the tree-based
combo-attention network, click-through rate improved by 2.29% and
conversion rate improved by 2.63%.
Comment: This revision is based on a manuscript submitted in October 2020, to
ICDE 2021. We thank the Program Committee for their valuable comments.
Reinforcement Learning for Generative AI: A Survey
Deep Generative AI has been a long-standing essential topic in the machine
learning community, which can impact a number of application areas like text
generation and computer vision. The major paradigm to train a generative model
is maximum likelihood estimation, which pushes the learner to capture and
approximate the target data distribution by decreasing the divergence between
the model distribution and the target distribution. This formulation
successfully establishes the objective of generative tasks, but it
cannot satisfy all the requirements that a user might expect from a
generative model. Reinforcement learning, a competitive option for
injecting new training signals by creating new objectives that exploit novel
signals, has demonstrated its power and flexibility in incorporating human
inductive bias from multiple angles, such as adversarial learning,
hand-designed rules, and learned reward models, to build a performant model.
Thereby, reinforcement learning has become a trending research field and has
stretched the limits of generative AI in both model design and application. It
is therefore reasonable to summarize recent advances in a
comprehensive review. Although surveys of individual application areas have appeared
recently, this survey aims to provide a high-level review that spans a
range of application areas. We provide a rigorous taxonomy of this area and
give sufficient coverage to various models and applications. Notably, we also
survey the fast-developing area of large language models. We conclude this survey
by outlining potential directions that might address the limitations of current
models and expand the frontiers of generative AI.
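The link between maximum likelihood estimation and divergence minimization mentioned in the abstract above can be made explicit with a standard identity. The notation is an assumption, since the abstract fixes no symbols: $p_{\text{data}}$ denotes the target distribution and $p_\theta$ the model distribution. The abstract also does not name the divergence; the Kullback-Leibler divergence is used here because it is the one that maximum likelihood corresponds to.

```latex
\begin{align}
D_{\mathrm{KL}}\left(p_{\text{data}} \,\|\, p_\theta\right)
  &= \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_{\text{data}}(x)\right]
   - \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_\theta(x)\right], \\
\arg\min_\theta D_{\mathrm{KL}}\left(p_{\text{data}} \,\|\, p_\theta\right)
  &= \arg\max_\theta \mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_\theta(x)\right].
\end{align}
```

The first expectation does not depend on $\theta$, so minimizing the divergence and maximizing the expected log-likelihood select the same parameters.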
Impression-Aware Recommender Systems
Novel data sources bring new opportunities to improve the quality of
recommender systems. Impressions are a novel data source containing past
recommendations (shown items) and traditional interactions. Researchers may use
impressions to refine user preferences and overcome the current limitations in
recommender systems research. The relevance of, and interest in, impressions have
increased over the years, hence the need for a review of relevant work on this
type of recommender. We present a systematic literature review on recommender
systems using impressions, focusing on three fundamental angles in research:
recommenders, datasets, and evaluation methodologies. We provide three
categorizations of papers describing recommenders using impressions, present
each reviewed paper in detail, describe datasets with impressions, and analyze
the existing evaluation methodologies. Lastly, we present open questions and
future directions of interest, highlighting aspects missing in the literature
that can be addressed in future works.
Comment: 34 pages, 103 references, 6 tables, 2 figures, ACM UNDER REVIEW
Evaluating the Robustness of GAN-Based Inverse Reinforcement Learning Algorithms
We evaluate the robustness of reward functions learned with IRL when they are transferred to similar tasks. We exceed state-of-the-art results for one benchmark task and solve another one for the first time. We propose modifications that achieve faster and more stable training.