616 research outputs found
Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Collaborative filtering (CF) is the key technique for recommender systems
(RSs). CF exploits user-item behavior interactions (e.g., clicks) only and
hence suffers from the data sparsity issue. One research thread is to integrate
auxiliary information such as product reviews and news titles, leading to
hybrid filtering methods. Another thread is to transfer knowledge from other
source domains such as improving the movie recommendation with the knowledge
from the book domain, leading to transfer learning methods. In real-world life,
no single service can satisfy a user's all information needs. Thus it motivates
us to exploit both auxiliary and source information for RSs in this paper. We
propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH)
methods for cross-domain recommendation with unstructured text in an end-to-end
manner. TMH attentively extracts useful content from unstructured text via a
memory module and selectively transfers knowledge from a source domain via a
transfer network. On two real-world datasets, TMH shows better performance in
terms of three ranking metrics by comparing with various baselines. We conduct
thorough analyses to understand how the text content and transferred knowledge
help the proposed model.Comment: 11 pages, 7 figures, a full version for the WWW 2019 short pape
Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges
The growing prevalence and rapid evolution of offensive language in social
media amplify the complexities of detection, particularly highlighting the
challenges in identifying such content across diverse languages. This survey
presents a systematic and comprehensive exploration of Cross-Lingual Transfer
Learning (CLTL) techniques in offensive language detection in social media. Our
study stands as the first holistic overview to focus exclusively on the
cross-lingual scenario in this domain. We analyse 67 relevant papers and
categorise these studies across various dimensions, including the
characteristics of multilingual datasets used, the cross-lingual resources
employed, and the specific CLTL strategies implemented. According to "what to
transfer", we also summarise three main CLTL transfer approaches: instance,
feature, and parameter transfer. Additionally, we shed light on the current
challenges and future research opportunities in this field. Furthermore, we
have made our survey resources available online, including two comprehensive
tables that provide accessible references to the multilingual datasets and CLTL
methods used in the reviewed literature.Comment: 35 pages, 7 figure
Algorithms, applications and systems towards interpretable pattern mining from multi-aspect data
How do humans move around in the urban space and how do they differ when the city undergoes terrorist attacks? How do users behave in Massive Open Online courses~(MOOCs) and how do they differ if some of them achieve certificates while some of them not? What areas in the court elite players, such as Stephen Curry, LeBron James, like to make their shots in the course of the game? How can we uncover the hidden habits that govern our online purchases? Are there unspoken agendas in how different states pass legislation of certain kinds? At the heart of these seemingly unconnected puzzles is this same mystery of multi-aspect mining, i.g., how can we mine and interpret the hidden pattern from a dataset that simultaneously reveals the associations, or changes of the associations, among various aspects of the data (e.g., a shot could be described with three aspects, player, time of the game, and area in the court)? Solving this problem could open gates to a deep understanding of underlying mechanisms for many real-world phenomena. While much of the research in multi-aspect mining contribute broad scope of innovations in the mining part, interpretation of patterns from the perspective of users (or domain experts) is often overlooked. Questions like what do they require for patterns, how good are the patterns, or how to read them, have barely been addressed. Without efficient and effective ways of involving users in the process of multi-aspect mining, the results are likely to lead to something difficult for them to comprehend.
This dissertation proposes the M^3 framework, which consists of multiplex pattern discovery, multifaceted pattern evaluation, and multipurpose pattern presentation, to tackle the challenges of multi-aspect pattern discovery. Based on this framework, we develop algorithms, applications, and analytic systems to enable interpretable pattern discovery from multi-aspect data. Following the concept of meaningful multiplex pattern discovery, we propose PairFac to close the gap between human information needs and naive mining optimization. We demonstrate its effectiveness in the context of impact discovery in the aftermath of urban disasters. We develop iDisc to target the crossing of multiplex pattern discovery with multifaceted pattern evaluation. iDisc meets the specific information need in understanding multi-level, contrastive behavior patterns. As an example, we use iDisc to predict student performance outcomes in Massive Open Online Courses given users' latent behaviors. FacIt is an interactive visual analytic system that sits at the intersection of all three components and enables for interpretable, fine-tunable, and scrutinizable pattern discovery from multi-aspect data. We demonstrate each work's significance and implications in its respective problem context. As a whole, this series of studies is an effort to instantiate the M^3 framework and push the field of multi-aspect mining towards a more human-centric process in real-world applications
A Survey of Graph Neural Networks for Social Recommender Systems
Social recommender systems (SocialRS) simultaneously leverage user-to-item
interactions as well as user-to-user social relations for the task of
generating item recommendations to users. Additionally exploiting social
relations is clearly effective in understanding users' tastes due to the
effects of homophily and social influence. For this reason, SocialRS has
increasingly attracted attention. In particular, with the advance of Graph
Neural Networks (GNN), many GNN-based SocialRS methods have been developed
recently. Therefore, we conduct a comprehensive and systematic review of the
literature on GNN-based SocialRS. In this survey, we first identify 80 papers
on GNN-based SocialRS after annotating 2151 papers by following the PRISMA
framework (Preferred Reporting Items for Systematic Reviews and Meta-Analysis).
Then, we comprehensively review them in terms of their inputs and architectures
to propose a novel taxonomy: (1) input taxonomy includes 5 groups of input type
notations and 7 groups of input representation notations; (2) architecture
taxonomy includes 8 groups of GNN encoder, 2 groups of decoder, and 12 groups
of loss function notations. We classify the GNN-based SocialRS methods into
several categories as per the taxonomy and describe their details. Furthermore,
we summarize the benchmark datasets and metrics widely used to evaluate the
GNN-based SocialRS methods. Finally, we conclude this survey by presenting some
future research directions.Comment: GitHub repository with the curated list of papers:
https://github.com/claws-lab/awesome-GNN-social-recsy
Network Analysis on Incomplete Structures.
Over the past decade, networks have become an increasingly popular abstraction for problems in the physical, life, social and information sciences. Network analysis can be used to extract insights into an underlying system from the structure of its network representation. One of the challenges of applying network analysis is the fact that networks do not always have an observed and complete structure. This dissertation focuses on the problem of imputation and/or inference in the presence of incomplete network structures. I propose four novel systems, each of which, contain a module that involves the inference or imputation of an incomplete network that is necessary to complete the end task.
I first propose EdgeBoost, a meta-algorithm and framework that repeatedly applies a non-deterministic link predictor to improve the efficacy of community detection algorithms on networks with missing edges. On average EdgeBoost improves performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook. The second system, Butterworth, identifies a social network user's topic(s) of interests and automatically generates a set of social feed ``rankers'' that enable the user to see topic specific sub-feeds. Butterworth uses link prediction to infer the missing semantics between members of a user's social network in order to detect topical clusters embedded in the network structure. For automatically generated topic lists, Butterworth achieves an average top-10 precision of 78%, as compared to a time-ordered baseline of 45%. Next, I propose Dobby, a system for constructing a knowledge graph of user-defined keyword tags. Leveraging a sparse set of labeled edges, Dobby trains a supervised learning algorithm to infer the hypernym relationships between keyword tags. Dobby was evaluated by constructing a knowledge graph of LinkedIn's skills dataset, achieving an average precision of 85% on a set of human labeled hypernym edges between skills. Lastly, I propose Lobbyback, a system that automatically identifies clusters of documents that exhibit text reuse and generates ``prototypes'' that represent a canonical version of text shared between the documents. Lobbyback infers a network structure in a corpus of documents and uses community detection in order to extract the document clusters.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/133443/1/mattburg_1.pd
- …