AMER: Automatic Behavior Modeling and Interaction Exploration in Recommender System
User behavior and feature interactions are crucial in deep learning-based
recommender systems. A diverse set of behavior modeling and interaction
exploration methods has been proposed in the literature. Nevertheless, the design of
task-aware recommender systems still requires feature engineering and
architecture engineering from domain experts. In this work, we introduce AMER,
namely Automatic behavior Modeling and interaction Exploration in Recommender
systems with Neural Architecture Search (NAS). The core contributions of AMER
include the three-stage search space and the tailored three-step searching
pipeline. In the first step, AMER searches for residual blocks that incorporate
commonly used operations in the block-wise search space of stage 1 to model
sequential patterns in user behavior. In the second step, it progressively
investigates useful low-order and high-order feature interactions in the
non-sequential interaction space of stage 2. Finally, an aggregation
multi-layer perceptron (MLP) with shortcut connection is selected from flexible
dimension settings of stage 3 to combine features extracted from the previous
steps. For efficient and effective NAS, AMER employs the one-shot random search
in all three steps. Further analysis reveals that AMER's search space could
cover most of the representative behavior extraction and interaction
investigation methods, which demonstrates the universality of our design. The
extensive experimental results over various scenarios reveal that AMER could
outperform competitive baselines with elaborate feature engineering and
architecture engineering, indicating both the effectiveness and robustness of
the proposed method.
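To make the one-shot weight-sharing idea concrete, here is a minimal sketch of a block-wise search space with random path sampling. The candidate operations, dimensions, and toy data are illustrative assumptions, not AMER's actual implementation.

```python
import random
import torch
import torch.nn as nn

class MixedBlock(nn.Module):
    """One residual block; a candidate op is selected per forward pass."""
    def __init__(self, d):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Sequential(nn.Linear(d, d), nn.ReLU()),  # pointwise feed-forward
            nn.Conv1d(d, d, kernel_size=3, padding=1),  # temporal convolution
            nn.GRU(d, d, batch_first=True),             # recurrent op
        ])

    def forward(self, x, op_idx):
        op = self.ops[op_idx]
        if isinstance(op, nn.GRU):
            out, _ = op(x)
        elif isinstance(op, nn.Conv1d):
            out = op(x.transpose(1, 2)).transpose(1, 2)  # (B,T,d) <-> (B,d,T)
        else:
            out = op(x)
        return x + out  # shortcut (residual) connection

blocks = nn.ModuleList([MixedBlock(16) for _ in range(3)])
x = torch.randn(4, 10, 16)  # toy user-behavior sequence: (batch, steps, dim)
# One-shot random search: each training step (and later each evaluation)
# samples a random path through the shared supernet weights instead of
# training every candidate architecture from scratch.
path = [random.randrange(3) for _ in blocks]
for blk, op_idx in zip(blocks, path):
    x = blk(x, op_idx)
print(x.shape)  # torch.Size([4, 10, 16])
```

Sharing one set of weights across all sampled paths is what makes evaluating many random architectures cheap enough to repeat in all three search steps.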
Modeling Multi-aspect Preferences and Intents for Multi-behavioral Sequential Recommendation
Multi-behavioral sequential recommendation has recently attracted increasing
attention. However, existing methods suffer from two major limitations.
Firstly, user preferences and intents can be described in fine-grained detail
from multiple perspectives, yet these methods fail to capture this
multi-aspect nature. Secondly, user behaviors may contain noise, which most
existing methods cannot handle effectively. In this paper, we
present an attentive recurrent model with multiple projections to capture
Multi-Aspect preferences and INTents (MAINT for short). To extract multi-aspect
preferences from target behaviors, we propose a multi-aspect projection
mechanism for generating multiple preference representations from multiple
aspects. To extract multi-aspect intents from multi-typed behaviors, we propose
a behavior-enhanced LSTM and a multi-aspect refinement attention mechanism. The
attention mechanism can filter out noise and generate multiple intent
representations from different aspects. To adaptively fuse user preferences and
intents, we propose a multi-aspect gated fusion mechanism. Extensive
experiments conducted on real-world datasets have demonstrated the
effectiveness of our model.
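As a minimal sketch of the multi-aspect projection and gated fusion ideas described above: the number of aspects, the tanh/sigmoid forms, and all dimensions below are illustrative assumptions, not the paper's exact equations.

```python
import torch
import torch.nn as nn

class MultiAspectGatedFusion(nn.Module):
    def __init__(self, d, n_aspects=4):
        super().__init__()
        # One projection per aspect turns a single user state into several
        # aspect-specific preference / intent representations.
        self.pref_proj = nn.ModuleList([nn.Linear(d, d) for _ in range(n_aspects)])
        self.intent_proj = nn.ModuleList([nn.Linear(d, d) for _ in range(n_aspects)])
        self.gate = nn.Linear(2 * d, d)

    def forward(self, pref_state, intent_state):
        fused = []
        for wp, wi in zip(self.pref_proj, self.intent_proj):
            p = torch.tanh(wp(pref_state))      # aspect-specific preference
            i = torch.tanh(wi(intent_state))    # aspect-specific intent
            g = torch.sigmoid(self.gate(torch.cat([p, i], dim=-1)))
            fused.append(g * p + (1 - g) * i)   # gate adaptively mixes the two
        return torch.stack(fused, dim=1)        # (batch, n_aspects, d)

m = MultiAspectGatedFusion(d=32)
out = m(torch.randn(8, 32), torch.randn(8, 32))
print(out.shape)  # torch.Size([8, 4, 32])
```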
Visual Dynamics Models for Robotic Planning and Control
For a robot to interact with its environment, it must perceive the world and understand how the world evolves as a consequence of its actions. This thesis studies a few methods that a robot can use to respond to its observations, with a focus on instances that can leverage visual dynamics models. In general, these are models of how the visual observations of a robot evolve as a consequence of its actions. This could be in the form of predictive models that directly predict the future in the space of image pixels, in the space of visual features extracted from these images, or in the space of compact learned latent representations. The three instances that this thesis studies are in the context of visual servoing, visual planning, and representation learning for reinforcement learning. In the first case, we combine learned visual features with learned single-step predictive dynamics models and reinforcement learning to obtain visual servoing mechanisms. In the second case, we use a deterministic multi-step video prediction model to achieve various manipulation tasks through visual planning. In addition, we show that conventional video prediction models are ill-equipped to model uncertainty and multiple futures, which can limit the planning capabilities of the robot. To address this, we propose a stochastic video prediction model that is trained with a combination of variational, adversarial, and perceptual losses, and show that this model can predict futures that are more realistic, diverse, and accurate. Unlike the first two cases, in which the dynamics model is used to make predictions for decision-making, the third case learns the model solely for representation learning. We train a stochastic sequential latent variable model to obtain a latent representation, which we then use as an intermediate representation for reinforcement learning. We show that this approach improves final performance and sample efficiency.
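A rough sketch of the kind of stochastic sequential latent variable model described in the third case, with an encoder q(z_t | o_t), latent dynamics p(z_{t+1} | z_t, a_t), and a decoder p(o_t | z_t); the one-layer networks, sizes, and loss weighting are illustrative assumptions, not the thesis's architecture.

```python
import torch
import torch.nn as nn

class LatentDynamics(nn.Module):
    def __init__(self, obs_dim=64, act_dim=4, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(obs_dim, 2 * z_dim)          # q(z_t | o_t): mean, log-var
        self.dyn = nn.Linear(z_dim + act_dim, 2 * z_dim)  # p(z_{t+1} | z_t, a_t)
        self.dec = nn.Linear(z_dim, obs_dim)              # p(o_t | z_t)

    def reparam(self, stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # sampling trick
        return z, mu, logvar

    def forward(self, obs, act):
        z, mu, logvar = self.reparam(self.enc(obs))
        z_next, _, _ = self.reparam(self.dyn(torch.cat([z, act], dim=-1)))
        recon = self.dec(z)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon, z_next, kl  # z can serve as the state for an RL policy

model = LatentDynamics()
obs, act = torch.randn(8, 64), torch.randn(8, 4)
recon, z_next, kl = model(obs, act)
loss = ((recon - obs) ** 2).mean() + 1e-3 * kl  # reconstruction + weighted KL
```

Because the policy then consumes the compact latent state rather than raw pixels, the representation learning can improve sample efficiency even though the model's predictions are never used for planning.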
Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation
The rapid proliferation of new users and items on the social web has
aggravated the gray-sheep user/long-tail item challenge in recommender systems.
Historically, cross-domain co-clustering methods have successfully leveraged
shared users and items across dense and sparse domains to improve inference
quality. However, they rely on shared rating data and cannot scale to multiple
sparse target domains (i.e., the one-to-many transfer setting). This, combined
with the increasing adoption of neural recommender architectures, motivates us
to develop scalable neural layer-transfer approaches for cross-domain learning.
Our key intuition is to guide neural collaborative filtering with
domain-invariant components shared across the dense and sparse domains,
improving the user and item representations learned in the sparse domains. We
leverage contextual invariances across domains to develop these shared modules,
and demonstrate that with user-item interaction context, we can learn-to-learn
informative representation spaces even with sparse interaction data. We show
the effectiveness and scalability of our approach on two public datasets and a
massive transaction dataset from Visa, a global payments technology company
(19% Item Recall, 3x faster vs. training separate models for each domain). Our
approach is applicable to both implicit and explicit feedback settings.
Comment: SIGIR 2020
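A minimal sketch of the layer-transfer scheme the abstract describes: context-conditioned scoring layers shared across domains, with only the sparse domain's user/item embeddings trained. The module shapes, the freezing strategy, and all names are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class ContextualRecommender(nn.Module):
    def __init__(self, n_users, n_items, d=32, ctx_dim=8, shared=None):
        super().__init__()
        self.user = nn.Embedding(n_users, d)  # domain-specific
        self.item = nn.Embedding(n_items, d)  # domain-specific
        if shared is None:  # context-conditioned layers, shared across domains
            shared = nn.Sequential(
                nn.Linear(2 * d + ctx_dim, d), nn.ReLU(), nn.Linear(d, 1))
        self.shared = shared

    def forward(self, u, i, ctx):
        feats = torch.cat([self.user(u), self.item(i), ctx], dim=-1)
        return self.shared(feats)  # interaction score

dense = ContextualRecommender(n_users=10000, n_items=5000)
# ... train `dense` end-to-end on the dense source domain ...
sparse = ContextualRecommender(n_users=300, n_items=200, shared=dense.shared)
for p in sparse.shared.parameters():  # freeze the transferred invariant layers;
    p.requires_grad = False           # only sparse-domain embeddings are trained
score = sparse(torch.tensor([0]), torch.tensor([1]), torch.randn(1, 8))
```

Reusing one shared module across many sparse targets is what makes the one-to-many setting cheaper than training a separate model per domain.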