Adversarial Training Towards Robust Multimedia Recommender System
With the prevalence of multimedia content on the Web, recommender solutions
that can effectively leverage the rich signal in multimedia data are urgently
needed. Owing to the success of deep neural networks in representation
learning, recent advances in multimedia recommendation have largely focused on
exploring deep learning methods to improve recommendation accuracy. To
date, however, there has been little effort to investigate the robustness of
multimedia representation and its impact on the performance of multimedia
recommendation.
In this paper, we shed light on the robustness of multimedia recommender
systems. Using a state-of-the-art recommendation framework and deep image
features, we demonstrate that the overall system is not robust, such that a
small (but purposeful) perturbation on the input image will severely decrease
the recommendation accuracy. This implies a possible weakness of multimedia
recommender systems in predicting user preference and, more importantly, the
potential of improvement by enhancing its robustness. To this end, we propose a
novel solution named Adversarial Multimedia Recommendation (AMR), which can
lead to a more robust multimedia recommender model by using adversarial
learning. The idea is to train the model to defend against an adversary, which adds
perturbations to the target image with the purpose of decreasing the model's
accuracy. We conduct experiments on two representative multimedia
recommendation tasks, namely, image recommendation and visually-aware product
recommendation. Extensive results verify the positive effect of adversarial
learning and demonstrate the effectiveness of our AMR method. Source code is
available at https://github.com/duxy-me/AMR.
Comment: TKD
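The adversarial training idea described in the AMR abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified scorer (a dot product between a user vector and an item's image feature vector), a pairwise BPR-style loss, and a single gradient-direction perturbation of size eps. All function names, eps, and lam are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bpr_loss(user, pos_feat, neg_feat):
    # Pairwise ranking loss: -log(sigmoid(score_pos - score_neg)).
    x = dot(user, pos_feat) - dot(user, neg_feat)
    return math.log1p(math.exp(-x))

def worst_case_perturbation(user, pos_feat, neg_feat, eps):
    # Gradient of the loss w.r.t. each image feature, rescaled to L2 norm eps.
    # This is the direction in which a small change hurts the ranking most.
    x = dot(user, pos_feat) - dot(user, neg_feat)
    coeff = 1.0 - 1.0 / (1.0 + math.exp(-x))  # 1 - sigmoid(x)
    grad_pos = [-coeff * u for u in user]
    grad_neg = [coeff * u for u in user]

    def scale(grad):
        norm = math.sqrt(sum(g * g for g in grad)) or 1.0
        return [eps * g / norm for g in grad]

    return scale(grad_pos), scale(grad_neg)

def adversarial_objective(user, pos_feat, neg_feat, eps=0.1, lam=1.0):
    # Assumed training objective: clean loss plus lam times the loss
    # measured on adversarially perturbed image features.
    d_pos, d_neg = worst_case_perturbation(user, pos_feat, neg_feat, eps)
    adv_pos = [c + d for c, d in zip(pos_feat, d_pos)]
    adv_neg = [c + d for c, d in zip(neg_feat, d_neg)]
    clean = bpr_loss(user, pos_feat, neg_feat)
    adv = bpr_loss(user, adv_pos, adv_neg)
    return clean, adv, clean + lam * adv

clean, adv, total = adversarial_objective(
    user=[0.5, -0.2, 0.1],
    pos_feat=[1.0, 0.0, 0.5],
    neg_feat=[0.2, 0.3, -0.1],
)
print(clean, adv, total)
```

In this sketch the perturbed loss is larger than the clean loss, which is the effect the adversary aims for. Minimizing the combined objective during training is what would push the model to tolerate such perturbations.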
Human Pose Driven Object Effects Recommendation
In this paper, we research the new topic of object effects recommendation in
micro-video platforms, which is a challenging but important task for many
practical applications such as advertisement insertion. To avoid the problem of
introducing background bias caused by directly learning video content from
image frames, we propose to utilize the meaningful body language hidden in 3D
human pose for recommendation. To this end, in this work, a novel human pose
driven object effects recommendation network termed PoseRec is introduced.
PoseRec leverages the advantages of 3D human pose detection and learns
information from multi-frame 3D human pose for video-item registration,
resulting in high quality object effects recommendation performance. Moreover,
to solve the inherent ambiguity and sparsity issues that exist in object
effects recommendation, we further propose a novel item-aware implicit
prototype learning module and a novel pose-aware transductive hard-negative
mining module to better learn pose-item relationships. In addition, to
benchmark methods for this new research topic, we build a new dataset for object
effects recommendation named Pose-OBE. Extensive experiments on Pose-OBE
demonstrate that our method outperforms strong baselines.
Knowledge-aware Complementary Product Representation Learning
Learning product representations that reflect complementary relationships
plays a central role in e-commerce recommender systems. In the absence of the
product relationship graph, which existing methods rely on, there is a need to
detect the complementary relationships directly from noisy and sparse customer
purchase activities. Furthermore, unlike simple relationships such as
similarity, complementariness is asymmetric and non-transitive. Standard
representation learning relies on only one set of embeddings, which is
problematic for modelling such properties of complementariness. We propose
using knowledge-aware learning with dual product embedding to solve the above
challenges. We encode contextual knowledge into product representation by
multi-task learning, to alleviate the sparsity issue. By explicitly modelling
with user bias terms, we separate the noise of customer-specific preferences
from the complementariness. Furthermore, we adopt the dual embedding framework
to capture the intrinsic properties of complementariness and provide geometric
interpretation motivated by the classic separating hyperplane theory. Finally,
we propose a Bayesian network structure that unifies all the components and
also includes several popular models as special cases. The proposed method
compares favourably to state-of-the-art methods in downstream classification and
recommendation tasks. We also develop an implementation that scales efficiently
to a dataset with millions of items and customers.
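To see why a single embedding per product cannot express an asymmetric relation while dual embeddings can, consider the following minimal sketch. It is an illustration under stated assumptions, not the paper's model: each product gets a hypothetical "source" vector and a "target" vector, the score is a plain dot product plus an optional user bias term, and the product names and numbers are made up.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical dual embeddings: one vector for the product as the item
# already purchased (source), one for it as the suggested complement (target).
source_emb = {"phone": [1.0, 0.0], "case": [0.0, 0.2]}
target_emb = {"phone": [0.1, 0.1], "case": [0.9, 0.3]}

def complement_score(bought, candidate, user_bias=0.0):
    # Direction matters: score(bought -> candidate) uses different vectors
    # than score(candidate -> bought), so the relation can be asymmetric.
    return dot(source_emb[bought], target_emb[candidate]) + user_bias

def symmetric_score(a, b):
    # With a single embedding per product the dot product is symmetric
    # by construction, whatever the learned values are.
    return dot(source_emb[a], source_emb[b])

print(complement_score("phone", "case"), complement_score("case", "phone"))
print(symmetric_score("phone", "case") == symmetric_score("case", "phone"))
```

Here a case scores highly as a complement to a phone, while a phone scores poorly as a complement to a case. The user bias term stands in for the customer-specific preference the abstract says is separated from complementariness.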
Generative Recommendation: Towards Next-generation Recommender Paradigm
Recommender systems typically retrieve items from an item corpus for
personalized recommendations. However, such a retrieval-based recommender
paradigm faces two limitations: 1) the human-generated items in the corpus
might fail to satisfy the users' diverse information needs, and 2) users
usually adjust the recommendations via inefficient passive feedback, e.g.,
clicks. Recently, AI-Generated Content (AIGC) has achieved significant success,
offering the potential to overcome these limitations: 1) generative AI can
produce personalized items to satisfy users' information needs, and 2) the
newly emerged large language models significantly reduce the effort users need
to precisely express information needs via natural language instructions. In
this light, the boom of AIGC points the way towards the next-generation
recommender paradigm with two new objectives: 1) generating personalized
content through generative AI, and 2) integrating user instructions to guide
content generation.
To this end, we propose a novel Generative Recommender paradigm named
GeneRec, which adopts an AI generator to personalize content generation and
leverages user instructions. Specifically, we pre-process users' instructions
and traditional feedback via an instructor to output the generation guidance.
Given the guidance, we instantiate the AI generator through an AI editor and an
AI creator to repurpose existing items and create new items. Eventually,
GeneRec can perform content retrieval, repurposing, and creation to satisfy
users' information needs. Besides, to ensure the trustworthiness of the
generated items, we emphasize various fidelity checks. Moreover, we provide a
roadmap to envision future developments of GeneRec and several domain-specific
applications of GeneRec with potential research tasks. Lastly, we study the
feasibility of implementing the AI editor and AI creator for micro-video generation.
Dual Contrastive Network for Sequential Recommendation with User and Item-Centric Perspectives
With the explosion of today's streaming data, sequential recommendation is a
promising solution for time-aware personalized modeling. It aims to
infer the next item a given user will interact with, based on the user's
historical item sequence. Some recent works try to improve sequential
recommendation by randomly masking historical items so as to generate
self-supervised signals. But such an approach results in sparser item
sequences and unreliable signals. Besides, existing sequential recommendation
is only user-centric, i.e., it predicts the probability of candidate items
from the user's historical items in chronological order, which ignores
whether the items from a provider can be successfully recommended. Such
user-centric recommendation makes it impossible for providers to expose their
new items and results in popularity bias.
In this paper, we propose a novel Dual Contrastive Network (DCN) to generate
ground-truth self-supervised signals for sequential recommendation by using
auxiliary user sequences from an item-centric perspective. Specifically, we
propose dual representation contrastive learning to refine the representation
learning by minimizing the Euclidean distance between the representation of a
given user/item and those of its historical items/users. Before the second
contrastive learning module, we perform next-user prediction to capture the
trends of items preferred by certain types of users and provide personalized
exploration opportunities for item providers. Finally, we further propose dual
interest contrastive learning to self-supervise the dynamic interest from next
item/user prediction and the static interest of matching probability.
Experiments on four benchmark datasets verify the effectiveness of our proposed
method. A further ablation study also illustrates the boosting effect of the
proposed components on different sequential models.
Comment: 23 page
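The distance-minimizing step in the dual representation contrastive learning described above can be sketched roughly as follows. This is a simplified illustration, not the DCN implementation: it assumes that a history is summarized by the mean of its embeddings, and the function names and the learning rate lr are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(vectors):
    # Assumed aggregation: average the history embeddings dimension-wise.
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def dual_representation_loss(user_vec, history_items, item_vec, history_users):
    # User-centric term: the user should be close to the items they interacted with.
    user_side = euclidean(user_vec, mean_vector(history_items))
    # Item-centric term: the item should be close to the users who interacted with it.
    item_side = euclidean(item_vec, mean_vector(history_users))
    return user_side + item_side

def pull_towards(vec, target, lr):
    # One update step that moves vec a fraction lr of the way to target.
    return [v + lr * (t - v) for v, t in zip(vec, target)]

user = [0.0, 0.0]
item = [1.0, 1.0]
items_seen_by_user = [[1.0, 0.0], [0.0, 1.0]]
users_who_saw_item = [[0.0, 0.0], [0.0, 2.0]]

before = dual_representation_loss(user, items_seen_by_user, item, users_who_saw_item)
user = pull_towards(user, mean_vector(items_seen_by_user), lr=0.5)
item = pull_towards(item, mean_vector(users_who_saw_item), lr=0.5)
after = dual_representation_loss(user, items_seen_by_user, item, users_who_saw_item)
print(before, after)
```

Each update moves a representation toward the summary of its own history, so the combined loss shrinks. A real model would learn the embeddings by gradient descent together with the prediction objectives.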
Formalizing Multimedia Recommendation through Multimodal Deep Learning
Recommender systems (RSs) offer personalized navigation experiences on online
platforms, but recommendation remains a challenging task, particularly in
specific scenarios and domains. Multimodality can help tap into richer
information sources and construct more refined user/item profiles for
recommendations. However, existing literature lacks a shared and universal
schema for modeling and solving the recommendation problem through the lens of
multimodality. This work aims to formalize a general multimodal schema for
multimedia recommendation. It provides a comprehensive literature review of
multimodal approaches for multimedia recommendation from the last eight years,
outlines the theoretical foundations of a multimodal pipeline, and demonstrates
its rationale by applying it to selected state-of-the-art approaches. The work
also conducts a benchmarking analysis of recent algorithms for multimedia
recommendation within Elliot, a rigorous framework for evaluating recommender
systems. The main aim is to provide guidelines for designing and implementing
the next generation of multimodal approaches in multimedia recommendation.
Machine Learning Models for Educational Platforms
Scaling up education online and onlife presents numerous key challenges, such as classes that are hard to manage, an overwhelming number of content alternatives, and academic dishonesty during remote interaction. However, thanks to the wider availability of learning-related data and increasingly powerful computing, Artificial Intelligence has the potential to turn such challenges into an unparalleled opportunity. One of its sub-fields, namely Machine Learning, enables machines to receive data and learn for themselves, without being programmed with rules. Bringing this intelligent support to education at large scale has a number of advantages, such as avoiding manual, error-prone tasks and reducing the chance that learners engage in misconduct. Planning, collecting, developing, and predicting become essential steps to make this concrete in real-world education.
This thesis deals with the design, implementation, and evaluation of Machine Learning models in the context of online educational platforms deployed at large scale. Constructing and assessing the performance of intelligent models is a crucial step towards increasing the reliability and convenience of such an educational medium. The contributions result in large data sets and high-performing models that capitalize on Natural Language Processing, Human Behavior Mining, and Machine Perception. The model decisions aim to support stakeholders along the instructional pipeline, specifically in content categorization, content recommendation, learners’ identity verification, and learners’ sentiment analysis. Past research in this field often relied on statistical processes that are hard to apply at large scale. Through our studies, we explore the opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in the literature.
Supported by extensive experiments, our work reveals a clear opportunity in combining human and machine sensing for researchers interested in online education. Our findings illustrate the feasibility of designing and assessing Machine Learning models for categorization, recommendation, authentication, and sentiment prediction in this research area. Our results provide guidelines on model motivation, data collection, model design, and analysis techniques for the above application scenarios. Researchers can use our findings to improve data collection on educational platforms, to reduce bias in data and models, and to increase the effectiveness and reliability of their models, among other things. We expect that this thesis can further support the adoption of Machine Learning models in educational platforms, strengthening the role of data as a precious asset. The thesis outputs are publicly available at https://www.mirkomarras.com