Graph Masked Autoencoder for Sequential Recommendation
While some powerful neural network architectures (e.g., Transformer, Graph
Neural Networks) have achieved improved performance in sequential
recommendation with high-order item dependency modeling, they may suffer from
poor representation capability in label scarcity scenarios. To address the
issue of insufficient labels, Contrastive Learning (CL) has attracted much
attention in recent methods to perform data augmentation through embedding
contrasting for self-supervision. However, due to the hand-crafted property of
their contrastive view generation strategies, existing CL-enhanced models i)
can hardly yield consistent performance on diverse sequential recommendation
tasks; ii) may not be immune to user behavior data noise. In light of this, we
propose a simple yet effective Graph Masked AutoEncoder-enhanced sequential
Recommender system (MAERec) that adaptively and dynamically distills global
item transitional information for self-supervised augmentation. It naturally
avoids the above issue of heavy reliance on constructing high-quality embedding
contrastive views. Instead, an adaptive data reconstruction paradigm is
designed to be integrated with the long-range item dependency modeling, for
informative augmentation in sequential recommendation. Extensive experiments
demonstrate that our method significantly outperforms state-of-the-art baseline
models and can learn more accurate representations against data noise and
sparsity. Our implemented model code is available at
https://github.com/HKUDS/MAERec.
Comment: This paper has been published as a full paper at SIGIR 202
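To make the masked-reconstruction idea above concrete, below is a minimal sketch (not the authors' implementation): a fraction of edges in a toy item-transition graph is masked, the remaining graph is encoded by simple neighbor averaging, and a dot-product decoder scores the held-out transitions, giving the reconstruction loss a model would minimize. Graph, sizes, and the encoder are illustrative assumptions.

# Minimal sketch of the masked-autoencoder idea on an item-transition graph
# (illustration only; MAERec's adaptive masking and encoder differ).
import numpy as np

rng = np.random.default_rng(0)

n_items, dim = 6, 8
# toy item-transition edges (i -> j) mined from user sequences
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (1, 3), (2, 4)]

# randomly mask a fraction of edges; the model must reconstruct them
mask_ratio = 0.3
masked = set(rng.choice(len(edges), size=int(mask_ratio * len(edges)), replace=False).tolist())
visible = [e for k, e in enumerate(edges) if k not in masked]

emb = rng.normal(scale=0.1, size=(n_items, dim))  # item embeddings

def encode(emb, vis_edges):
    """One round of mean neighbor aggregation over the visible graph."""
    out = emb.copy()
    for i in range(len(emb)):
        nbrs = [j for (a, j) in vis_edges if a == i] + [a for (a, j) in vis_edges if j == i]
        if nbrs:
            out[i] = 0.5 * emb[i] + 0.5 * emb[nbrs].mean(axis=0)
    return out

z = encode(emb, visible)

def edge_logit(z, i, j):
    """Dot-product decoder: score for the transition i -> j."""
    return float(z[i] @ z[j])

# reconstruction loss on the masked (held-out) transitions
loss = -np.mean([np.log(1 / (1 + np.exp(-edge_logit(z, *edges[k])))) for k in masked])
print(f"reconstruction loss on masked transitions: {loss:.4f}")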
Adaptive Graph Contrastive Learning for Recommendation
Graph neural networks (GNNs) have recently emerged as effective
collaborative filtering (CF) approaches for recommender systems. The key idea
of GNN-based recommender systems is to recursively perform message passing
along user-item interaction edges to refine encoded embeddings, relying on
sufficient and high-quality training data. However, user behavior data in
practical recommendation scenarios is often noisy and exhibits a skewed
distribution. To address these issues, some recommendation approaches, such as
SGL, leverage self-supervised learning to improve user representations. These
approaches conduct self-supervised learning by creating contrastive views,
but they depend on the tedious trial-and-error selection of augmentation
methods. In this paper, we propose a novel Adaptive Graph Contrastive Learning
(AdaGCL) framework that conducts data augmentation with two adaptive
contrastive view generators to better empower the CF paradigm. Specifically, we
use two trainable view generators - a graph generative model and a graph
denoising model - to create adaptive contrastive views. With two adaptive
contrastive views, AdaGCL introduces additional high-quality training signals
into the CF paradigm, helping to alleviate data sparsity and noise issues.
Extensive experiments on three real-world datasets demonstrate the superiority
of our model over various state-of-the-art recommendation methods. Our model
implementation code is available at https://github.com/HKUDS/AdaGCL.
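As a rough illustration of how two adaptive views can supply extra training signal, the sketch below contrasts user embeddings from two perturbed views with an InfoNCE loss; the make_view function is a hypothetical stand-in for AdaGCL's graph generative and graph denoising generators, not the actual models.

# Minimal sketch of contrasting user embeddings from two adaptively generated views.
import numpy as np

rng = np.random.default_rng(1)
n_users, dim, tau = 5, 16, 0.2

base = rng.normal(size=(n_users, dim))

def make_view(base, noise):
    """Hypothetical view generator: a perturbed, re-normalized copy of the base embeddings."""
    v = base + noise * rng.normal(size=base.shape)
    return v / np.linalg.norm(v, axis=1, keepdims=True)

view_a = make_view(base, 0.1)   # e.g., output of a graph generative model
view_b = make_view(base, 0.1)   # e.g., output of a graph denoising model

# InfoNCE: each user's two views form a positive pair, other users are negatives
sim = view_a @ view_b.T / tau                                    # (n_users, n_users)
log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
nce_loss = -np.mean(np.diag(log_softmax))
print(f"contrastive (InfoNCE) loss: {nce_loss:.4f}")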
Disentangled Graph Social Recommendation
Social recommender systems have drawn a lot of attention in many online web
services, because they incorporate social information among users to improve
recommendation results. Despite the significant progress made by
existing solutions, we argue that current methods suffer from two
limitations: (1) Existing social-aware recommendation models only consider
collaborative similarity between items, while how to incorporate item-wise
semantic relatedness remains less explored in current recommendation paradigms. (2) Current
social recommender systems neglect the entanglement of the latent factors over
heterogeneous relations (e.g., social connections, user-item interactions).
Learning disentangled representations under such relation heterogeneity poses
a great challenge for social recommendation. In this work, we design a
Disentangled Graph Neural Network (DGNN) with the integration of latent memory
units, which empowers DGNN to maintain factorized representations for
heterogeneous types of user and item connections. Additionally, we devise new
memory-augmented message propagation and aggregation schemes under the graph
neural architecture, allowing us to recursively distill semantic relatedness
into the representations of users and items in a fully automatic manner.
Extensive experiments on three benchmark datasets verify the effectiveness of
our model, which achieves substantial improvements over state-of-the-art recommendation
techniques. The source code is publicly available at:
https://github.com/HKUDS/DGNN.
Comment: Accepted by IEEE ICDE 202
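The following sketch illustrates the general idea of factor-wise (disentangled) aggregation with a per-factor memory gate over a user-user relation; it is a simplified, assumption-based illustration, not DGNN's memory-augmented propagation scheme itself.

# Minimal sketch of factor-wise aggregation over one relation type (social edges).
import numpy as np

rng = np.random.default_rng(2)
n_users, n_factors, factor_dim = 4, 3, 4

user_emb = rng.normal(size=(n_users, n_factors, factor_dim))
social_edges = [(0, 1), (1, 2), (2, 3)]                # user-user relations
memory = rng.normal(size=(n_factors, factor_dim))      # one memory unit per latent factor

def aggregate(user_emb, edges, memory):
    out = user_emb.copy()
    for u in range(n_users):
        nbrs = [v for (a, v) in edges if a == u] + [a for (a, v) in edges if v == u]
        if not nbrs:
            continue
        msg = user_emb[nbrs].mean(axis=0)                       # (n_factors, factor_dim)
        # memory-guided gate: how much each factor absorbs from this relation
        gate = 1 / (1 + np.exp(-np.sum(msg * memory, axis=1)))  # (n_factors,)
        out[u] = user_emb[u] + gate[:, None] * msg
    return out

updated = aggregate(user_emb, social_edges, memory)
print(updated.shape)  # (4, 3, 4): per-user, per-factor representations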
Leveraging Sequential Episode Mining for Session-Based News Recommendation
News recommender systems aim to help users find interesting and relevant news stories while mitigating information overload. Over the past few decades, various challenges have emerged in developing effective algorithms for real-world scenarios. These challenges include capturing evolving user preferences and addressing concept drift during reading sessions. Additionally, ensuring the freshness and timeliness of news content poses significant obstacles. To address these issues, we utilize an innovative sequential pattern mining approach known as Marbles to capture user behavior. Marbles leverages frequent episodes to generate a collection of association rules, where a frequent episode is a partially ordered pattern that occurs frequently in the input sequence. The recommendation process involves identifying relevant rules extracted from these patterns and weighting them. Subsequently, a heuristic procedure assesses candidate rules and generates a list of recommendations for users based on their most recent reading session. Notably, we conduct our evaluation in a streaming scenario that closely models real-world usage, where both our algorithm and the baselines dynamically update their models with each user click. Through this empirical evaluation, we demonstrate the applicability of the Marbles algorithm in session-based recommendation. Our proposed approach outperforms baseline algorithms on two real-world data sets, effectively addressing the challenges specific to the news domain.
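A minimal sketch of the rule-application step described above: hypothetical episode rules (antecedent items, recommended item, weight) are matched against the current reading session and candidates are scored by summed rule weights. The rule mining itself (Marbles) and the partial ordering of episodes are not modeled here.

# Minimal sketch of rule-based session recommendation with hypothetical rules.
from collections import defaultdict

# antecedent item set -> recommended item, with a weight (e.g., support * confidence)
rules = [
    ({"politics_1", "politics_2"}, "politics_3", 0.9),
    ({"sports_1"}, "sports_2", 0.7),
    ({"politics_2"}, "economy_1", 0.4),
]

def recommend(session, rules, k=3):
    """Score candidates by summing the weights of rules whose antecedent is in the session."""
    clicked = set(session)
    scores = defaultdict(float)
    for antecedent, consequent, weight in rules:
        if antecedent <= clicked and consequent not in clicked:
            scores[consequent] += weight
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["politics_1", "politics_2"], rules))  # ['politics_3', 'economy_1']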
Graph Transformer for Recommendation
This paper presents a novel approach to representation learning in
recommender systems by integrating generative self-supervised learning with
graph transformer architecture. We highlight the importance of high-quality
data augmentation with relevant self-supervised pretext tasks for improving
performance. Towards this end, we propose a new approach that automates the
self-supervision augmentation process through a rationale-aware generative SSL
that distills informative user-item interaction patterns. The proposed
recommender with Graph TransFormer (GFormer) offers parameterized
collaborative rationale discovery for selective augmentation while preserving
global-aware user-item relationships. In GFormer, we allow the rationale-aware
SSL to inspire graph collaborative filtering with task-adaptive invariant
rationalization in the graph transformer. The experimental results reveal that our
GFormer has the capability to consistently improve the performance over
baselines on different datasets. Several in-depth experiments further
investigate the invariant rationale-aware augmentation from various aspects.
The source code for this work is publicly available at:
https://github.com/HKUDS/GFormer.
Comment: Accepted by SIGIR'202
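As a toy illustration of rationale-style selective augmentation, the sketch below scores user-item edges and keeps only the top fraction as the "rationale" subgraph used for the self-supervised task; the scoring function and sizes are assumptions, not GFormer's parameterized rationale discovery.

# Minimal sketch of selecting a rationale subgraph from scored user-item edges.
import numpy as np

rng = np.random.default_rng(3)
n_users, n_items, dim = 4, 5, 8

user_emb = rng.normal(size=(n_users, dim))
item_emb = rng.normal(size=(n_items, dim))
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0), (3, 4)]  # (user, item) interactions

def rationale_scores(edges, user_emb, item_emb):
    """Score each edge by embedding agreement; higher = more likely informative."""
    return np.array([user_emb[u] @ item_emb[i] for u, i in edges])

keep_ratio = 0.5
scores = rationale_scores(edges, user_emb, item_emb)
keep = np.argsort(-scores)[: int(keep_ratio * len(edges))]
rationale_edges = [edges[k] for k in keep]
print("edges kept as rationale:", rationale_edges)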
Timestamps as Prompts for Geography-Aware Location Recommendation
Location recommendation plays a vital role in improving users' travel
experience. The timestamp of the POI to be predicted is of great significance,
since a user will go to different places at different times. However, most
existing methods either do not use this kind of temporal information, or just
implicitly fuse it with other contextual information. In this paper, we revisit
the problem of location recommendation and point out that explicitly modeling
temporal information is a great help when the model needs to predict not only
the next location but also further locations. In addition, state-of-the-art
methods do not make effective use of geographic information and suffer from the
hard boundary problem when encoding geographic information by gridding. To this
end, a Temporal Prompt-based and Geography-aware (TPG) framework is proposed.
A temporal prompt is first designed to incorporate the temporal information of
any further check-in. A shifted window mechanism is then devised to augment
geographic data for addressing the hard boundary problem. Via extensive
comparisons with existing methods and ablation studies on five real-world
datasets, we demonstrate the effectiveness and superiority of the proposed
method under various settings. Most importantly, our proposed model has a
superior ability for interval prediction. In particular, the model can predict
the location that a user wants to go to at a certain time while the most recent
check-in behavioral data is masked, or it can predict a specific future check-in
(not just the next one) at a given timestamp.
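The shifted window idea for the hard boundary problem can be sketched as follows: each coordinate is assigned a cell in a base grid and in a half-cell-shifted grid, so two nearby points split by a boundary of one grid usually still share a cell in the other. Cell sizes and coordinates are illustrative, and this is not TPG's exact geographic encoder.

# Minimal sketch of a shifted-window grid encoding for coordinates.
def grid_cell(lat, lon, cell_size=0.01, shift=0.0):
    """Return a discrete (row, col) cell id for a coordinate, with optional shift."""
    row = int((lat + shift) // cell_size)
    col = int((lon + shift) // cell_size)
    return (row, col)

def encode_location(lat, lon, cell_size=0.01):
    base = grid_cell(lat, lon, cell_size, shift=0.0)
    shifted = grid_cell(lat, lon, cell_size, shift=cell_size / 2)
    return base, shifted

# two POIs just across a boundary of the base grid still share the shifted cell
a = encode_location(22.2999, 114.1700)
b = encode_location(22.3001, 114.1700)
print(a, b)  # base rows differ (2229 vs 2230); shifted rows agree (2230)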
Exploring and linking biomedical resources through multidimensional semantic spaces
Background
The semantic integration of biomedical resources remains a challenging issue, yet it is required for effective information processing and data analysis. The availability of comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitates this integration effort by means of semantic annotation, which allows disparate data formats and contents to be expressed under a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where dimensions correspond to the different perspectives in biomedical research (e.g., population, disease, anatomy and protein/genes).
Results
This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. This method consists of two main processes: knowledge and data normalization. The former arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter reduces the annotation set associated with each collection item to a set of points in the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed, one for each target disease of the project, the UniProt database, and the HeC patient record database. We adopted the UMLS Metathesaurus 2010AA as the reference knowledge resource.
Conclusions
Current knowledge resources and semantic-aware technology make the integration of biomedical resources possible. Such an integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for integration, exploration, and analysis tasks. Results over a real scenario demonstrate the viability and usefulness of the approach, as well as the quality of the generated multidimensional semantic spaces.
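A minimal sketch of the data normalization step described above, under assumed (placeholder) concept-to-dimension assignments: an item's annotation set is reduced to one value per analysis dimension, yielding a point in the multidimensional space.

# Minimal sketch of "data normalization": map annotations onto analysis dimensions.
concept_to_dimension = {
    "CONCEPT_HEART":         ("anatomy", "heart"),
    "CONCEPT_HEART_FAILURE": ("disease", "heart failure"),
    "CONCEPT_PROTEIN_X":     ("gene/protein", "protein X"),
}

def normalize_item(annotations):
    """Reduce an annotation set to one label per dimension (first hit per dimension kept)."""
    point = {}
    for concept in annotations:
        if concept in concept_to_dimension:
            dim, label = concept_to_dimension[concept]
            point.setdefault(dim, label)
    return point

abstract_annotations = ["CONCEPT_HEART_FAILURE", "CONCEPT_HEART", "CONCEPT_PROTEIN_X"]
print(normalize_item(abstract_annotations))
# {'disease': 'heart failure', 'anatomy': 'heart', 'gene/protein': 'protein X'}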
Finding Semantically Related Videos in Closed Collections
Modern newsroom tools offer advanced functionality for automatic and semi-automatic content collection from the web and social media sources to accompany news stories. However, the content collected in this way often tends to be unstructured and may include irrelevant items. An important step in the verification process is to organize this content, both with respect to what it shows and with respect to its origin. This chapter presents our efforts in this direction, which resulted in two components. One aims to detect semantic concepts in video shots, to help annotate and organize content collections. We implement a system based on deep learning, featuring a number of advances and adaptations of existing algorithms to increase performance on the task. The other component aims to detect logos in videos in order to identify their provenance. We present our progress from a keypoint-based detection system to a system based on deep learning.
Neural System for Answering Questions about Auditory Scene Comprehension
This project introduces the Acoustic Question Answering (AQA) task, in which an intelligent agent must answer a question about the content of an auditory scene. First, a database (CLEAR) comprising auditory scenes together with question-answer pairs for each of them is built to enable the training of neural-network-based systems. Since this task is analogous to the Visual Question Answering (VQA) task, a preliminary study is carried out using a neural network (FiLM) initially developed for the VQA task. The auditory scenes are first transformed into a spectro-temporal representation so that they can be processed as images by the FiLM network. The goal of this study is to quantify the performance of a system initially designed for visual scenes when used in an acoustic context. Along the same lines, the effectiveness of the convolutional coordinate map technique (CoordConv), originally a visual technique, is studied when applied in an acoustic context. Finally, a new neural network adapted to the acoustic context (NAAQA) is introduced.
NAAQA achieves better performance than FiLM on the CLEAR database while being about 7 times less complex.
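As a sketch of the preprocessing mentioned above, the snippet below turns an audio signal into a spectro-temporal "image" (log-magnitude spectrogram) that an image-oriented network such as FiLM could consume; frame sizes and the toy signal are assumptions, not the CLEAR/NAAQA pipeline.

# Minimal sketch: frame an audio signal and compute a log-magnitude spectrogram.
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Frame the signal, apply a Hann window, and take the FFT magnitude per frame."""
    window = np.hanning(frame_len)
    frames = [signal[s:s + frame_len] * window
              for s in range(0, len(signal) - frame_len + 1, hop)]
    mags = np.abs(np.fft.rfft(np.stack(frames), axis=1))   # (time, freq)
    return np.log1p(mags).T                                 # (freq, time) "image"

sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t) + 0.1 * np.random.randn(sr)  # 1-second toy scene
spec = spectrogram(audio)
print(spec.shape)  # (257, 61): a 2-D input for a CNN/FiLM-style model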