194 research outputs found
A Survey on Bayesian Deep Learning
A comprehensive artificial intelligence system needs to not only perceive the
environment with different 'senses' (e.g., seeing and hearing) but also infer
the world's conditional (or even causal) relations and corresponding
uncertainty. The past decade has seen major advances in many perception tasks
such as visual object recognition and speech recognition using deep learning
models. For higher-level inference, however, probabilistic graphical models
with their Bayesian nature are still more powerful and flexible. In recent
years, Bayesian deep learning has emerged as a unified probabilistic framework
to tightly integrate deep learning and Bayesian models. In this general
framework, the perception of text or images using deep learning can boost the
performance of higher-level inference and in turn, the feedback from the
inference process is able to enhance the perception of text or images. This
survey provides a comprehensive introduction to Bayesian deep learning and
reviews its recent applications to recommender systems, topic models, control,
etc. We also discuss the relationship and differences between Bayesian
deep learning and other related topics such as the Bayesian treatment of neural
networks.
Comment: To appear in ACM Computing Surveys (CSUR) 202
Criteria Tell You More than Ratings: Criteria Preference-Aware Light Graph Convolution for Effective Multi-Criteria Recommendation
The multi-criteria (MC) recommender system, which leverages MC rating
information in a wide range of e-commerce areas, is ubiquitous nowadays.
Surprisingly, although graph neural networks (GNNs) have been widely applied to
develop various recommender systems due to their high expressive capability in
learning graph representations, how to design MC recommender systems with GNNs
remains unexplored. In light of this, we make the first attempt
towards designing a GNN-aided MC recommender system. Specifically, rather than
straightforwardly adopting existing GNN-based recommendation methods, we devise
a novel criteria preference-aware light graph convolution (CPA-LGC) method, which
is capable of precisely capturing the criteria preference of users as well as
the collaborative signal in complex high-order connectivities. To this end, we
first construct an MC expansion graph that transforms user-item MC ratings
into an expanded bipartite graph to potentially learn from the collaborative
signal in MC ratings. Next, to strengthen the capability of criteria preference
awareness, CPA-LGC incorporates newly characterized embeddings, including
user-specific criteria-preference embeddings and item-specific criterion
embeddings, into our graph convolution model. Through comprehensive evaluations
using four real-world datasets, we demonstrate (a) the superiority over
benchmark MC recommendation methods and benchmark recommendation methods using
GNNs with tremendous gains, (b) the effectiveness of core components in
CPA-LGC, and (c) the computational efficiency.
Comment: 12 pages, 10 figures, 5 tables; 29th ACM SIGKDD Conference on
Knowledge Discovery & Data (KDD 2023) (to appear) (Please cite our conference
version.)
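The abstract above describes turning user-item MC ratings into an expanded bipartite graph but does not give the construction. The following is a minimal sketch under one plausible reading, assumed here and not confirmed by the abstract: each item is split into one node per criterion, and every observed criterion rating becomes a user-to-criterion-item edge. The function name `build_mc_expansion_graph` and the node encoding are hypothetical.

```python
def build_mc_expansion_graph(mc_ratings, num_criteria):
    """Expand user-item multi-criteria ratings into bipartite edges.

    mc_ratings maps (user, item) to a list with one rating per criterion;
    None marks a missing criterion rating.
    ASSUMPTION: one node per (item, criterion) pair; the abstract does not
    specify the actual expansion used by CPA-LGC.
    """
    edges = []
    for (user, item), ratings in mc_ratings.items():
        if len(ratings) != num_criteria:
            raise ValueError("expected one rating slot per criterion")
        for criterion, rating in enumerate(ratings):
            if rating is None:
                continue  # no edge for an unobserved criterion rating
            user_node = ("user", user)
            criterion_item_node = ("item", item, criterion)
            edges.append((user_node, criterion_item_node, rating))
    return edges


example_ratings = {
    ("u1", "i1"): [4, None, 5],
    ("u2", "i1"): [3, 2, None],
}
example_edges = build_mc_expansion_graph(example_ratings, num_criteria=3)
```

In this sketch a graph convolution model would then propagate embeddings over `example_edges`; how CPA-LGC weights or filters these edges is not stated in the abstract.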
Practical recommendations for gradient-based training of deep architectures
Learning algorithms related to artificial neural networks and in particular
for Deep Learning may seem to involve many bells and whistles, called
hyper-parameters. This chapter is meant as a practical guide with
recommendations for some of the most commonly used hyper-parameters, in
particular in the context of learning algorithms based on back-propagated
gradient and gradient-based optimization. It also discusses how to deal with
the fact that more interesting results can be obtained when allowing one to
adjust many hyper-parameters. Overall, it describes elements of the practice
used to successfully and efficiently train and debug large-scale and often deep
multi-layer neural networks. It closes with open questions about the training
difficulties observed with deeper architectures.
Joint Neural Collaborative Filtering for Recommender Systems
We propose a J-NCF method for recommender systems. The J-NCF model applies a
joint neural network that couples deep feature learning and deep interaction
modeling with a rating matrix. Deep feature learning extracts feature
representations of users and items with a deep learning architecture based on a
user-item rating matrix. Deep interaction modeling captures non-linear
user-item interactions with a deep neural network using the feature
representations generated by the deep feature learning process as input. J-NCF
enables the deep feature learning and deep interaction modeling processes to
optimize each other through joint training, which leads to improved
recommendation performance. In addition, we design a new loss function for
optimization, which takes both implicit and explicit feedback, point-wise and
pair-wise loss into account. Experiments on several real-world datasets show
significant improvements of J-NCF over state-of-the-art methods, with
improvements of up to 8.24% on the MovieLens 100K dataset, 10.81% on the
MovieLens 1M dataset, and 10.21% on the Amazon Movies dataset in terms of
HR@10. NDCG@10 improvements are 12.42%, 14.24% and 15.06%, respectively. We
also conduct experiments to evaluate the scalability and sensitivity of J-NCF.
Our experiments show that the J-NCF model has a competitive recommendation
performance with inactive users and different degrees of data sparsity when
compared to state-of-the-art baselines.
Comment: 30 page
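The J-NCF results above are reported as HR@10 and NDCG@10. As a rough illustration of what these metrics measure, the sketch below assumes a leave-one-out setup in which each user has a single held-out item and the model returns a ranked list; the abstract does not state the evaluation protocol, so that setup and the function names are assumptions.

```python
import math


def hit_ratio_at_k(ranked_items, held_out_item, k=10):
    """Return 1.0 if the held-out item is among the top-k items, else 0.0."""
    return 1.0 if held_out_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, held_out_item, k=10):
    """NDCG@k with a single relevant item: 1 / log2(position + 2)."""
    top_k = ranked_items[:k]
    if held_out_item not in top_k:
        return 0.0
    position = top_k.index(held_out_item)  # 0-based rank of the hit
    return 1.0 / math.log2(position + 2)


def evaluate(rankings, k=10):
    """Average HR@k and NDCG@k over (ranked_items, held_out_item) pairs."""
    n = len(rankings)
    hr = sum(hit_ratio_at_k(r, t, k) for r, t in rankings) / n
    ndcg = sum(ndcg_at_k(r, t, k) for r, t in rankings) / n
    return hr, ndcg


# Two hypothetical users: the first has a hit at rank 2, the second misses.
mean_hr, mean_ndcg = evaluate([([5, 3, 9], 3), ([1, 2, 4], 7)], k=10)
```

HR@k only records whether the held-out item was retrieved, while NDCG@k also rewards placing it closer to the top of the list.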
Explainable Neural Attention Recommender Systems
Recommender systems, predictive models that provide lists of personalized suggestions, have become increasingly popular in many web-based businesses. By presenting potential items that may interest a user, these systems can improve both monetization and users’ satisfaction. In recent years, the most successful approaches rely on capturing what best defines users and items in the form of latent vectors, a numeric representation that assumes all instances can be described by their respective affiliation towards a set of hidden features. However, recommendation methods based on latent features still face some real-world limitations. The data sparsity problem originates from the unprecedented variety of available items, making generated suggestions irrelevant to many users. Furthermore, many systems are now expected to accompany their suggestions with corresponding reasoning. Users who receive unjustified recommendations they do not agree with are likely to stop using the system or ignore its suggestions. In this work we investigate the current trends in the field of recommender systems and focus on two rising areas, deep recommendation and explainable recommender systems. First we present Textual and Contextual Embedding-based Neural Recommender (TCENR), a model that mitigates the data sparsity problem in the area of point-of-interest (POI) recommendation. This method employs different types of deep neural networks to learn varied perspectives of the same user-location interaction, using textual reviews, geographical data and social networks.
Apprentissage de représentation pour des données générées par des utilisateurs
In this thesis, we study how representation learning methods can be applied to user-generated data. Our contributions cover three different applications but share a common denominator: the extraction of relevant user representations. Our first application is the item recommendation task, where recommender systems build user and item profiles out of past ratings reflecting user preferences and item characteristics. Nowadays, textual information is often available together with ratings, and we propose to use it to enrich the profiles extracted from the ratings. We hope to extract shared opinions and preferences from the textual content, and thereby a finer knowledge of user tastes. The models we propose provide another opportunity: predicting the text a user would write on an item. Our second application is sentiment analysis and, in particular, polarity classification. Our idea is that recommender systems can be used for such a task. Recommender systems and traditional polarity classifiers operate on different time scales. We propose two hybridizations of these models: the former has better classification performance, the latter highlights a vocabulary of surprise in the texts of the reviews. The third and final application we consider is urban mobility. It takes place beyond the frontiers of the Internet, in the physical world. Using authentication logs of the subway users, logging the time and station at which users take the subway, we show that it is possible to extract robust temporal profiles that characterize users by their usage and temporal habits.