MONET: Modality-Embracing Graph Convolutional Network and Target-Aware Attention for Multimedia Recommendation
In this paper, we focus on multimedia recommender systems using graph
convolutional networks (GCNs) where the multimodal features as well as
user-item interactions are employed together. Our study aims to exploit
multimodal features more effectively in order to accurately capture users'
preferences for items. To this end, we point out the following two limitations of
existing GCN-based multimedia recommender systems: (L1) although the multimodal
features of the items a user has interacted with can reveal her preferences for
items, existing methods utilize GCNs designed to focus only on capturing
collaborative signals, so the multimodal features are insufficiently reflected
in the final user/item embeddings; (L2) although a user decides whether she
prefers a target item by considering its multimodal features, existing methods
represent her with a single embedding regardless of the target item's multimodal
features and then use that embedding to predict her preference for the target
item. To address these issues, we propose a novel multimedia recommender system,
named MONET, composed of the following two core ideas:
modality-embracing GCN (MeGCN) and target-aware attention. Through extensive
experiments using four real-world datasets, we demonstrate i) the significant
superiority of MONET over seven state-of-the-art competitors (up to 30.32%
higher accuracy in terms of recall@20, compared to the best competitor) and ii)
the effectiveness of the two core ideas in MONET. All MONET code is available
at https://github.com/Kimyungi/MONET.
Comment: Accepted by WSDM 202
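To make the target-aware attention idea concrete, the sketch below re-weights a user embedding by the target item's per-modality embeddings, so the same user is represented differently for each candidate item. It is a minimal PyTorch sketch under assumed shapes and names (TargetAwareAttention, embed_dim, etc.), not MONET's released implementation.

```python
# Minimal sketch of a target-aware attention layer (hypothetical names, not
# MONET's code): the user embedding attends over the target item's per-modality
# embeddings, yielding a target-aware user representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetAwareAttention(nn.Module):
    def __init__(self, embed_dim: int):
        super().__init__()
        self.query = nn.Linear(embed_dim, embed_dim, bias=False)  # projects the user
        self.key = nn.Linear(embed_dim, embed_dim, bias=False)    # projects item modalities

    def forward(self, user_emb, item_modal_embs):
        # user_emb: (batch, d); item_modal_embs: (batch, M, d) for M modalities
        q = self.query(user_emb).unsqueeze(1)                 # (batch, 1, d)
        k = self.key(item_modal_embs)                         # (batch, M, d)
        scores = (q * k).sum(-1) / k.size(-1) ** 0.5          # scaled dot product, (batch, M)
        weights = F.softmax(scores, dim=-1)                   # attention over modalities
        # fuse modality information into a target-aware user representation
        return user_emb + (weights.unsqueeze(-1) * item_modal_embs).sum(dim=1)

# toy usage: 8 users, two modalities (e.g. visual and textual) per target item
attn = TargetAwareAttention(embed_dim=64)
users = torch.randn(8, 64)
modalities = torch.randn(8, 2, 64)
target_aware_users = attn(users, modalities)                  # (8, 64)
```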
Adversarial Training Towards Robust Multimedia Recommender System
With the prevalence of multimedia content on the Web, developing recommender
solutions that can effectively leverage the rich signal in multimedia data is
urgently needed. Owing to the success of deep neural networks in representation
learning, recent advances in multimedia recommendation have largely focused on
exploring deep learning methods to improve recommendation accuracy. To
date, however, there has been little effort to investigate the robustness of
multimedia representation and its impact on the performance of multimedia
recommendation.
In this paper, we shed light on the robustness of multimedia recommender
systems. Using a state-of-the-art recommendation framework and deep image
features, we demonstrate that the overall system is not robust, such that a
small (but purposeful) perturbation on the input image will severely decrease
the recommendation accuracy. This implies a possible weakness of multimedia
recommender systems in predicting user preference and, more importantly, the
potential for improvement by enhancing their robustness. To this end, we propose a
novel solution named Adversarial Multimedia Recommendation (AMR), which can
lead to a more robust multimedia recommender model by using adversarial
learning. The idea is to train the model to defend against an adversary, which adds
perturbations to the target image with the purpose of decreasing the model's
accuracy. We conduct experiments on two representative multimedia
recommendation tasks, namely, image recommendation and visually-aware product
recommendation. Extensive results verify the positive effect of adversarial
learning and demonstrate the effectiveness of our AMR method. Source code is
available at https://github.com/duxy-me/AMR.
Comment: TKD
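To illustrate the adversarial-learning idea, the sketch below adds an FGSM-style perturbation to the positive item's deep image feature so as to increase a BPR ranking loss, then trains against both the clean and the perturbed features. It is a hedged sketch in the spirit of the description above; the function names, epsilon value, and loss weighting lam are assumptions rather than AMR's actual code.

```python
# Hedged sketch of adversarial training on image features (not the official AMR
# code): a BPR ranking loss, an FGSM-style perturbation that maximizes it, and a
# combined objective that trains the model to resist that perturbation.
import torch
import torch.nn.functional as F

def bpr_loss(user_emb, pos_feat, neg_feat, img_proj):
    """Bayesian Personalized Ranking loss with linearly projected image features."""
    pos_score = (user_emb * img_proj(pos_feat)).sum(-1)
    neg_score = (user_emb * img_proj(neg_feat)).sum(-1)
    return -F.logsigmoid(pos_score - neg_score).mean()

def adversarial_training_step(user_emb, pos_feat, neg_feat, img_proj,
                              epsilon=0.05, lam=1.0):
    # 1) FGSM: perturb the positive item's image feature in the direction that
    #    increases the ranking loss the most
    pos_feat = pos_feat.detach().clone().requires_grad_(True)
    clean_loss = bpr_loss(user_emb, pos_feat, neg_feat, img_proj)
    grad, = torch.autograd.grad(clean_loss, pos_feat, retain_graph=True)
    delta = epsilon * grad.sign()
    # 2) combined objective: clean loss + weighted loss under the perturbation
    adv_loss = bpr_loss(user_emb, (pos_feat + delta).detach(), neg_feat, img_proj)
    return clean_loss + lam * adv_loss

# toy usage with 4096-d CNN features projected to a 64-d embedding space
users = torch.nn.Embedding(1000, 64)
img_proj = torch.nn.Linear(4096, 64)
opt = torch.optim.Adam(list(users.parameters()) + list(img_proj.parameters()), lr=1e-3)

u = users(torch.randint(0, 1000, (32,)))
pos, neg = torch.randn(32, 4096), torch.randn(32, 4096)
loss = adversarial_training_step(u, pos, neg, img_proj)
opt.zero_grad(); loss.backward(); opt.step()
```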
Formalizing Multimedia Recommendation through Multimodal Deep Learning
Recommender systems (RSs) offer personalized navigation experiences on online
platforms, but recommendation remains a challenging task, particularly in
specific scenarios and domains. Multimodality can help tap into richer
information sources and construct more refined user/item profiles for
recommendations. However, existing literature lacks a shared and universal
schema for modeling and solving the recommendation problem through the lens of
multimodality. This work aims to formalize a general multimodal schema for
multimedia recommendation. It provides a comprehensive literature review of
multimodal approaches for multimedia recommendation from the last eight years,
outlines the theoretical foundations of a multimodal pipeline, and demonstrates
its rationale by applying it to selected state-of-the-art approaches. The work
also conducts a benchmarking analysis of recent algorithms for multimedia
recommendation within Elliot, a rigorous framework for evaluating recommender
systems. The main aim is to provide guidelines for designing and implementing
the next generation of multimodal approaches in multimedia recommendation.
Visual BFI: an Exploratory Study for Image-based Personality Test
This paper positions and explores the topic of image-based personality tests.
Instead of responding to text-based questions, the subjects are provided with a
set of "choose-your-favorite-image" visual questions. With the image options of
each question belonging to the same concept, the subjects' personality traits
are estimated by observing their preferences for images under several unique
concepts. Designing such an image-based personality test consists of two steps:
concept-question identification and image-option selection. We have
presented a preliminary framework to regularize these two steps in this
exploratory study. A demo version of the designed image-based personality test
is available at http://www.visualbfi.org/. Subjective as well as objective
evaluations have demonstrated the feasibility of an image-based personality test
with a limited number of questions.
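As a purely illustrative sketch of how preferences over images might be turned into trait estimates, the code below averages per-image Big Five annotations over a subject's chosen options. The annotation scheme, shapes, and names are assumptions and not the Visual BFI method.

```python
# Hedged sketch of scoring an image-based personality test (illustrative only):
# each image option carries an annotation over the Big Five traits, and a
# subject's trait estimate is the mean annotation of the images she picks.
import numpy as np

TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]

def estimate_traits(chosen_options, option_trait_scores):
    """
    chosen_options:      index of the chosen image for each concept-question
    option_trait_scores: (num_questions, num_options, 5) trait annotations per image
    """
    picked = option_trait_scores[np.arange(len(chosen_options)), chosen_options]  # (Q, 5)
    return dict(zip(TRAITS, picked.mean(axis=0)))

# toy usage: 10 concept-questions with 4 image options each
scores = np.random.rand(10, 4, 5)
answers = [0, 2, 1, 3, 0, 1, 2, 2, 3, 0]
print(estimate_traits(answers, scores))
```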
Social and content hybrid image recommender system for mobile social networks
One of the advantages of social networks is the possibility to socialize and
personalize the content created or shared by the users. In mobile social
networks, where devices have limited capabilities in terms of screen size and
computing power, multimedia recommender systems help present the most relevant
content to the users, depending on their tastes, relationships and profile.
Previous recommender systems are not able to cope with the uncertainty of
automated tagging and are dependent on the knowledge domain. In addition, a
recommender instantiated in this domain should cope with problems arising from
the inherent nature of collaborative filtering (cold start, the banana problem,
the large number of users needed, etc.). The solution presented in this paper
addresses the above-mentioned problems by proposing a hybrid image recommender
system, which combines collaborative filtering (social techniques) with
content-based techniques, leaving the user free to assign a personal weight to
each of these processes. It takes into account aesthetics and the formal
characteristics of the images to overcome the problems of current techniques,
improving on existing systems to create a recommender for mobile social
networks with a high degree of adaptation to any kind of user.
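To illustrate the weighted hybrid described above, the sketch below blends a collaborative-filtering score with a content-based score derived from image descriptors, using a user-chosen weight. The cosine-similarity choice and all names (content_scores, hybrid_scores, alpha) are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of a weighted hybrid recommender score: the user chooses how much
# weight to give the social (collaborative) component versus the content-based one.
import numpy as np

def content_scores(user_profile, image_features):
    """Cosine similarity between a user's content profile and each candidate image."""
    num = image_features @ user_profile
    den = np.linalg.norm(image_features, axis=1) * np.linalg.norm(user_profile) + 1e-9
    return num / den

def hybrid_scores(cf, cb, alpha=0.5):
    """alpha = 1.0 is purely collaborative, alpha = 0.0 is purely content-based."""
    return alpha * cf + (1.0 - alpha) * cb

# toy usage: rank 5 candidate images for a user who favors the social component
cf = np.array([0.9, 0.1, 0.4, 0.7, 0.2])     # collaborative-filtering scores
features = np.random.rand(5, 8)              # aesthetic/formal image descriptors
profile = np.random.rand(8)                  # the user's content profile
ranking = np.argsort(-hybrid_scores(cf, content_scores(profile, features), alpha=0.6))
```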