Sequential Recommendation with Self-Attentive Multi-Adversarial Network
Recently, deep learning has made significant progress in the task of
sequential recommendation. Existing neural sequential recommenders typically
adopt a generative approach trained with Maximum Likelihood Estimation (MLE). When
context information (referred to as factors) is involved, it is difficult to analyze
when and how each individual factor affects the final recommendation
performance. To address this, we take a new perspective and introduce
adversarial learning to sequential recommendation. In this paper, we present a
Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the
effect of context information on sequential recommendation. Specifically, our
proposed MFGAN has two kinds of modules: a Transformer-based generator taking
user behavior sequences as input to recommend the possible next items, and
multiple factor-specific discriminators to evaluate the generated sub-sequence
from the perspectives of different factors. To learn the parameters, we adopt
the classic policy gradient method, and use the reward signals from the
discriminators to guide the learning of the generator. Our framework is
flexible to incorporate multiple kinds of factor information, and is able to
trace how each factor contributes to the recommendation decision over time.
Extensive experiments conducted on three real-world datasets demonstrate the
superiority of our proposed model over the state-of-the-art methods, in terms
of effectiveness and interpretability.
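As a rough illustration of the training scheme the abstract describes (a generator updated by policy gradient using rewards from multiple factor-specific discriminators), here is a minimal REINFORCE-style sketch. All names and numbers (`N_ITEMS`, `disc_scores`, the softmax policy standing in for the Transformer generator) are hypothetical stand-ins, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

N_ITEMS = 5    # toy item vocabulary
N_FACTORS = 3  # e.g. price, brand, rating discriminators (illustrative)
LR = 0.1

# Stand-in for the Transformer generator: a softmax policy over next items,
# parameterised by per-item logits.
logits = np.zeros(N_ITEMS)

def policy(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Stand-in factor-specific discriminators: each scores an item in [0, 1];
# here they are fixed random preferences for illustration only.
disc_scores = rng.random((N_FACTORS, N_ITEMS))

for _ in range(200):
    probs = policy(logits)
    item = rng.choice(N_ITEMS, p=probs)          # sample a next item
    reward = disc_scores[:, item].mean()         # aggregate discriminator rewards
    # REINFORCE gradient of log pi(item) w.r.t. logits, scaled by the reward:
    grad = -probs * reward
    grad[item] += reward
    logits += LR * grad

probs = policy(logits)
```

In the paper the reward would come from discriminators scoring the generated sub-sequence under each factor; here the fixed `disc_scores` matrix only mimics that signal.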
Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean
(1) efficient path planning that eliminates first-move lags; (2) collision-free
and smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG mitigates the sparse-reward and local-minimum
problems for RL agents; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (e.g., when unexpected obstacles appear), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies.
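To make the second level concrete, here is a minimal Least-Squares Policy Iteration (LSPI) sketch on a toy 1-D corridor between two subgoals, using one-hot state-action features and the standard LSTDQ solve. The corridor, reward, and feature choices are illustrative assumptions, not the paper's setup:

```python
import numpy as np

N_STATES, GAMMA = 5, 0.9   # toy corridor; state N_STATES-1 is the next subgoal
ACTIONS = (-1, +1)         # move left / right

def step(s, a):
    """Deterministic toy dynamics: reward 1 on reaching the subgoal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def phi(s, a):
    """One-hot feature vector over (state, action) pairs."""
    f = np.zeros(N_STATES * len(ACTIONS))
    f[s * len(ACTIONS) + ACTIONS.index(a)] = 1.0
    return f

# Batch of transitions (s, a, s', r) — LSPI works from a fixed sample set.
samples = [(s, a, *step(s, a)) for s in range(N_STATES) for a in ACTIONS]

w = np.zeros(N_STATES * len(ACTIONS))
for _ in range(20):  # policy-iteration sweeps
    def pi(s):  # greedy policy w.r.t. current weights
        return max(ACTIONS, key=lambda a: phi(s, a) @ w)
    # LSTDQ: solve A w = b with A = sum phi (phi - gamma phi')^T, b = sum phi r
    A = np.zeros((len(w), len(w)))
    b = np.zeros(len(w))
    for s, a, s2, r in samples:
        f = phi(s, a)
        A += np.outer(f, f - GAMMA * phi(s2, pi(s2)))
        b += f * r
    w = np.linalg.solve(A + 1e-6 * np.eye(len(w)), b)

greedy = [max(ACTIONS, key=lambda a: phi(s, a) @ w) for s in range(N_STATES)]
```

On this toy corridor the learned greedy policy moves right in every state, i.e., toward the subgoal; the full SG-RL system would use richer features and kinematic state between the subgoals produced by SSG.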