83,051 research outputs found

Disentangled Graph Social Recommendation

Author: Huang Chao
Pei Jian
Shao Yizhen
Xia Lianghao
Xu Huance
Xu Yong
Publication date: 14/03/2023

Social recommender systems have drawn a lot of attention in many online web services because they incorporate social information between users to improve recommendation results. Despite the significant progress made by existing solutions, we argue that current methods fall short in two respects: (1) Existing social-aware recommendation models only consider collaborative similarity between items; how to incorporate item-wise semantic relatedness is less explored in current recommendation paradigms. (2) Current social recommender systems neglect the entanglement of the latent factors over heterogeneous relations (e.g., social connections, user-item interactions). Learning disentangled representations with relation heterogeneity poses a great challenge for social recommendation. In this work, we design a Disentangled Graph Neural Network (DGNN) with the integration of latent memory units, which empowers DGNN to maintain factorized representations for heterogeneous types of user and item connections. Additionally, we devise new memory-augmented message propagation and aggregation schemes under the graph neural architecture, allowing us to recursively distill semantic relatedness into the representations of users and items in a fully automatic manner. Extensive experiments on three benchmark datasets verify the effectiveness of our model by achieving great improvement over state-of-the-art recommendation techniques. The source code is publicly available at: https://github.com/HKUDS/DGNN.
Comment: Accepted by IEEE ICDE 202

arXiv.org e-Print Archive

SgWalk: Location Recommendation by User Subgraph-Based Graph Embedding

Author: Canturk Deniz
Karagöz Pınar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

The popularity of Location-based Social Networks (LBSNs) provides an opportunity to collect massive multi-modal datasets that contain geographical information, as well as time and social interactions. Such data is a useful resource for generating personalized location recommendations. Such heterogeneous data can be further extended with notions of trust between users, the popularity of locations, and the expertise of users. Recently, the use of Heterogeneous Information Network (HIN) models and graph neural architectures has proven successful for recommendation problems. One limitation of such solutions is the difficulty of capturing the contextual relationships between the nodes in the heterogeneous network. In location recommendation, spatial context is a frequently used consideration, in that users prefer to get recommendations within their spatial vicinity. To solve this challenging problem, we propose a novel Heterogeneous Information Network (HIN) embedding technique, SgWalk, which explores the proximity between users and locations and generates location recommendations via subgraph-based node embedding. SgWalk follows four steps: building user subgraphs according to location context, generating random walk sequences over the user subgraphs, learning embeddings of the nodes in the LBSN graph, and generating location recommendations using the vector representations of the nodes. SgWalk is differentiated from existing techniques that rely on meta-paths or bipartite graphs by its use of the contextual user subgraph. In this way, it aims to capture contextual relationships among heterogeneous nodes more effectively. The recommendation accuracy of SgWalk is analyzed through extensive experiments conducted on benchmark datasets in terms of top-n location recommendations. The accuracy evaluation results indicate a minimum average improvement of 23% (@5 recommendation) in accuracy compared to baseline techniques and the state-of-the-art heterogeneous graph embedding techniques in the literature.
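
The second of the four SgWalk steps named above (random walk sequences over a user subgraph) can be illustrated with a minimal sketch. The abstract does not state how walks are sampled, so the uniform neighbor choice, the function name `random_walks`, the node labels, and the toy subgraph below are all assumptions made for illustration, not the authors' method.

```python
import random


def random_walks(adjacency, start, walk_length, num_walks, seed=0):
    """Generate uniform random walks over an adjacency dict (assumed sampling rule)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    walks = []
    for _ in range(num_walks):
        walk = [start]
        while len(walk) < walk_length:
            neighbors = adjacency.get(walk[-1], [])
            if not neighbors:
                break  # dead end: stop this walk early
            walk.append(rng.choice(neighbors))
        walks.append(walk)
    return walks


# Hypothetical toy user subgraph: two users and two locations.
subgraph = {
    "user:a": ["loc:cafe", "loc:park", "user:b"],
    "user:b": ["user:a", "loc:park"],
    "loc:cafe": ["user:a"],
    "loc:park": ["user:a", "user:b"],
}

walks = random_walks(subgraph, "user:a", walk_length=5, num_walks=10)
```

Sequences like these would then be passed to whatever embedding learner is used in the third step; that part is not sketched here because the abstract gives no detail on it.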

Directory of Open Access Journals

OpenMETU (Middle East Technical University)

TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations

Author: El-Kishky Ahmed
Florez Omar
Han Jiawei
Malkov Yury
McWilliams Brian
Park Serim
Zhang Xinyang
Publication date: 15/09/2022

We present TwHIN-BERT, a multilingual language model trained on in-domain data from the popular social network Twitter. TwHIN-BERT differs from prior pre-trained language models as it is trained with not only text-based self-supervision, but also with a social objective based on the rich social engagements within a Twitter heterogeneous information network (TwHIN). Our model is trained on 7 billion tweets covering over 100 distinct languages providing a valuable representation to model short, noisy, user-generated text. We evaluate our model on a variety of multilingual social recommendation and semantic understanding tasks and demonstrate significant metric improvement over established pre-trained language models. We will freely open-source TwHIN-BERT and our curated hashtag prediction and social engagement benchmark datasets to the research community

arXiv.org e-Print Archive

Source-Aware Embedding Training on Heterogeneous Information Networks

Author: Chan Tsai Hor
Shen Jiajun
Wong Chi Ho
Yin Guosheng
Publication venue: 'MIT Press - Journals'
Publication date: 10/07/2023

Heterogeneous information networks (HINs) have been extensively applied to real-world tasks, such as recommendation systems, social networks, and citation networks. While existing HIN representation learning methods can effectively learn the semantic and structural features in the network, little attention has been given to the distribution discrepancy of subgraphs within a single HIN. However, we find that ignoring such distribution discrepancy among subgraphs from multiple sources would hinder the effectiveness of graph embedding learning algorithms. This motivates us to propose SUMSHINE (Scalable Unsupervised Multi-Source Heterogeneous Information Network Embedding), a scalable unsupervised framework to align the embedding distributions among multiple sources of an HIN. Experimental results on real-world datasets in a variety of downstream tasks validate the performance of our method over the state-of-the-art heterogeneous information network embedding algorithms.
Comment: Published in Data Intelligence 202

arXiv.org e-Print Archive

Privacy Risk in Anonymized Heterogeneous Information Networks

Author: Aston Zhang
Carl A Gunter
Chuan Chang
Jiawei Han
Kevin Chen
Xiaofeng Wang
Xing Xie
Publication date: 24/04/2020

Anonymized user datasets are often released for research or industry applications. As an example, t.qq.com released its anonymized users' profile, social interaction, and recommendation log data in KDD Cup 2012 to call for recommendation algorithms. Since the entities (users and so on) and edges (links among entities) are of multiple types, the released social network is a heterogeneous information network. Prior work has shown how privacy can be compromised in homogeneous information networks by the use of specific types of graph patterns. We show how the extra information derived from heterogeneity can be used to relax these assumptions. To characterize and demonstrate this added threat, we formally define privacy risk in an anonymized heterogeneous information network to identify the vulnerability in the possible ways such data are released, and further present a new de-anonymization attack that exploits the vulnerability. Our attack successfully de-anonymized most individuals involved in the data: for an anonymized 1,000-user t.qq.com network of density 0.01, the attack precision is over 90% with a 2.3-million-user auxiliary network.

CiteSeerX

Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs

Author: Ajwani Deepak
Gatterbauer Wolfgang
Nicholson Patrick K.
Riedewald Mirek
Sala Alessandra
Yang Xiaofeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. For users, it is difficult to select a value for k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, return as many of the top-ranked results as possible. Then, given additional time, produce the next lower-ranked results quickly as well. It can be stopped anytime, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) the fact that users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query, and incremental guided search. It enables us to prove strong non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Through experimental studies we show that KARPET achieves running times in the order of milliseconds for tree patterns on large networks with millions of nodes and edges.
Comment: To appear in WWW 201
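
The any-k behaviour described above (return as many top-ranked results as a time budget allows, then resume) can be illustrated with a deliberately simplified sketch. This is not KARPET: it ranks pairs drawn from two sorted weight lists, which stands in for a two-node pattern with no edge constraints, and it assumes that a lower total weight ranks higher. The names `ranked_pairs` and `take_for` are hypothetical.

```python
import heapq
import time


def ranked_pairs(left, right):
    """Lazily yield (total, i, j) in nondecreasing total = left[i] + right[j].

    Both input lists must be sorted in ascending order.
    """
    if not left or not right:
        return
    heap = [(left[0] + right[0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        total, i, j = heapq.heappop(heap)
        yield total, i, j
        # Only the two successors of a popped pair can be the next best.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (left[ni] + right[nj], ni, nj))


def take_for(results, budget_seconds):
    """Collect results until the budget is spent; `results` stays resumable.

    At least one result is returned if any is available.
    """
    deadline = time.monotonic() + budget_seconds
    collected = []
    for item in results:
        collected.append(item)
        if time.monotonic() >= deadline:
            break
    return collected


# Hypothetical sorted candidate weights for the two pattern nodes.
left = [1, 4, 6]
right = [2, 3, 9]

stream = ranked_pairs(left, right)
first_batch = take_for(stream, 0.0)   # tiny budget: best result only
later_batch = take_for(stream, 5.0)   # more time: resume where we stopped
```

Because the generator keeps its heap between calls, a caller never has to choose k in advance; it simply asks for more results when more time is available.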

arXiv.org e-Print Archive

Crossref

Mining and Analyzing the Academic Network

Author: Yang Zaihan
Publication venue: Lehigh Preserve

Social network research has attracted the interest of many researchers, not only in analyzing online social networking applications, such as Facebook and Twitter, but also in providing comprehensive services in the scientific research domain. We define an Academic Network as a social network which integrates scientific factors, such as authors, papers, affiliations, and publishing venues, and their relationships, such as co-authorship among authors and citations among papers. By mining and analyzing the academic network, we can provide users with comprehensive services such as searching for research experts, published papers, and conferences, as well as detecting research communities or the evolution of hot research topics. We can also provide recommendations to users on with whom to collaborate, whom to cite and where to submit. In this dissertation, we investigate two main tasks that have fundamental applications in academic network research. In the first, we address the problem of expertise retrieval, also known as expert finding or ranking, in which we identify and return a ranked list of researchers, based upon their estimated expertise or reputation, in response to user-specified queries. In the second, we address the problem of research action recommendation (prediction), specifically, the tasks of publishing venue recommendation, citation recommendation and coauthor recommendation. For both tasks, our principal goal is to effectively mine and integrate heterogeneous information and thereby develop well-functioning ranking or recommender systems.
For the task of expertise retrieval, we first proposed or applied three modified versions of PageRank-like algorithms to citation network analysis; we then proposed an enhanced author-topic model by simultaneously modeling citation and publishing venue information; we finally incorporated the pair-wise learning-to-rank algorithm into the traditional topic modeling process, and further improved the model by integrating groups of author-specific features. For the task of research action recommendation, we first proposed an improved neighborhood-based collaborative filtering approach for publishing venue recommendation; we then applied our proposed enhanced author-topic model and demonstrated its effectiveness in both cited author prediction and publishing venue prediction; finally we proposed an extended latent factor model that can jointly model several relations in an academic environment in a unified way and verified its performance in four recommendation tasks: the recommendation on author-co-authorship, author-paper citation, paper-paper citation and paper-venue submission. Extensive experiments conducted on large-scale real-world data sets demonstrated the superiority of our proposed models over other existing state-of-the-art methods.
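
For readers unfamiliar with the PageRank-like ranking mentioned above, here is a minimal sketch of plain PageRank by power iteration on a toy citation graph. It is the textbook algorithm, not any of the dissertation's three modified versions, which the abstract does not describe. The function name, the damping value 0.85, the iteration count, and the paper identifiers are assumptions for illustration.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Plain PageRank over a dict mapping each paper to the papers it cites."""
    nodes = set(citations)
    for cited in citations.values():
        nodes.update(cited)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Papers that cite nothing spread their rank uniformly over all papers.
        dangling = sum(rank[v] for v in nodes if not citations.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for source, cited in citations.items():
            if cited:
                share = damping * rank[source] / len(cited)
                for target in cited:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy citation graph: an edge points from citing to cited paper.
citations = {
    "p1": ["p3"],
    "p2": ["p3"],
    "p3": [],
    "p4": ["p1", "p3"],
}

scores = pagerank(citations)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the paper cited by every other paper, `p3`, ends up ranked first; an author-level score could be derived by aggregating paper scores, though how the dissertation does so is not stated in the abstract.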

Lehigh University: Lehigh Preserve