Recommending on graphs: a comprehensive review from a data perspective
Recent advances in graph-based learning approaches have demonstrated their
effectiveness in modelling users' preferences and items' characteristics for
Recommender Systems (RSs). Most of the data in RSs can be organized into graphs
where various objects (e.g., users, items, and attributes) are explicitly or
implicitly connected and influence each other via various relations. Such a
graph-based organization brings benefits to exploiting potential properties in
graph learning (e.g., random walk and network embedding) techniques to enrich
the representations of the user and item nodes, which is an essential factor
for successful recommendations. In this paper, we provide a comprehensive
survey of Graph Learning-based Recommender Systems (GLRSs). Specifically, we
start from a data-driven perspective to systematically categorize various
graphs in GLRSs and analyze their characteristics. Then, we discuss the
state-of-the-art frameworks with a focus on the graph learning module and how
they address practical recommendation challenges such as scalability, fairness,
diversity, explainability and so on. Finally, we share some potential research
directions in this rapidly growing area.
Comment: Accepted by UMUA
A Comprehensive Survey on Deep Graph Representation Learning
Graph representation learning aims to effectively encode high-dimensional
sparse graph-structured data into low-dimensional dense vectors, which is a
fundamental task that has been widely studied in a range of fields, including
machine learning and data mining. Classic graph embedding methods follow the
basic idea that the embedding vectors of interconnected nodes in the graph can
still maintain a relatively close distance, thereby preserving the structural
information between the nodes in the graph. However, this is sub-optimal
because (i) traditional methods have limited model capacity, which limits
learning performance; (ii) existing techniques typically rely on unsupervised
learning strategies and fail to couple with the latest learning paradigms; and
(iii) representation learning and downstream tasks depend on each other and
should be jointly enhanced. With the remarkable success of deep learning,
deep graph representation learning has shown great potential and advantages
over shallow (traditional) methods, and a large number of deep graph
representation learning techniques have been proposed in the past decade,
especially graph neural networks. In this survey, we conduct a comprehensive
survey on current deep graph representation learning algorithms by proposing a
new taxonomy of existing state-of-the-art literature. Specifically, we
systematically summarize the essential components of graph representation
learning and categorize existing approaches by the ways of graph neural network
architectures and the most recent advanced learning paradigms. Moreover, this
survey also provides the practical and promising applications of deep graph
representation learning. Last but not least, we state new perspectives and
suggest challenging directions that deserve further investigation in the
future.
Software expert discovery via knowledge domain embeddings in a collaborative network
© 2018 Elsevier B.V. Community Question Answering (CQA) websites can be regarded as the major venues for knowledge sharing and the most effective way of exchanging knowledge at present. Because a massive number of users participate online and generate a huge amount of data, managing this knowledge systematically can be challenging. Expert recommendation is one of the major challenges: it highlights CQA users with potential expertise, which may help match unresolved questions with existing high-quality answers and, at the same time, may give external services such as human resource systems another reference for evaluating their candidates. In this work, we propose an approach to discovering experts on CQA websites. We take advantage of recent distributed word representation technology to summarize text chunks and, from a semantic view, exploit the relationships between natural language phrases to extract latent knowledge domains. Within each domain, users’ expertise is determined from their historical performance, and a ranking can be computed to give recommendations accordingly. In particular, Stack Overflow is chosen as the dataset to test and evaluate our work, and comprehensive experiments show the competence of our approach.
Toward Sustainable Recommendation Systems
Recommendation systems are ubiquitous, acting as an essential component in online platforms to help users discover items of interest. For example, streaming services rely on recommendation systems to serve high-quality informational and entertaining content to their users, and e-commerce platforms recommend interesting items to assist customers in making shopping decisions. Furthermore, the algorithms and frameworks driving recommendation systems provide the foundation for new personalized machine learning methods that have wide-ranging impacts.
While successful, many current recommendation systems are fundamentally not sustainable: they focus on short-lived engagement objectives, requiring constant fine-tuning to adapt to the dynamics of evolving systems, or are subject to performance degradation as users and items churn in the system. In this dissertation research, we seek to lay the foundations for a new class of sustainable recommendation systems. By sustainable, we mean a recommendation system should be fundamentally long-lived, while enhancing both current and future potential to connect users with interesting content. By building such sustainable recommendation systems, we can continuously improve the user experience and provide a long-lived foundation for ongoing engagement. Building on a large body of work in recommendation systems, with the advance in graph neural networks, and with recent success in meta-learning for ML-based models, this dissertation focuses on sustainability in recommendation systems from the following three perspectives with corresponding contributions:
• Adaptivity: The first contribution lies in capturing the temporal effects from the instant shifting of users’ preferences to the lifelong evolution of users and items in real-world scenarios, leading to models which are highly adaptive to the temporal dynamics present in online platforms and provide improved item recommendation at different timestamps.
• Resilience: Secondly, we seek to identify the elite users who act as the “backbone” of recommendation systems and shape the opinions of other users via their public activities. By investigating the correlation between users’ preferences on item consumption and their connections to the “backbone”, we enable recommendation models to be resilient to dramatic changes, including churn in new items and users and frequently updated connections between users in online communities.
• Robustness: Finally, we explore the design of a novel framework for “learning-to-adapt” to the imperfect test cases in recommendation systems, ranging from cold-start users with few interactions to casual users with low activity levels. Such a model is robust to the imperfection in real-world environments, resulting in reliable recommendation to meet user needs and aspirations.
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyze five semantic processing tasks, namely word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.
Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missing from the published version due
to the publication policies. Please contact Prof. Erik Cambria for details
A constraint-based hypergraph partitioning approach to coreference resolution
The objectives of this thesis are focused on research in machine learning for
coreference resolution. Coreference resolution is a natural language processing
task that consists of determining the expressions in a discourse that mention or
refer to the same entity.
The main contributions of this thesis are (i) a new approach to coreference
resolution based on constraint satisfaction, using a hypergraph to represent
the problem and solving it by relaxation labeling; and (ii) research towards
improving coreference resolution performance using world knowledge extracted
from Wikipedia.
The developed approach is able to use an entity-mention classification model
that is more expressive than the pair-based ones, and it overcomes weaknesses
of previous state-of-the-art approaches such as linking contradictions,
classifications without context, and lack of information when evaluating pairs.
Furthermore, the approach allows the incorporation of new information by adding
constraints, and research has been done on using world knowledge to
improve performance.
RelaxCor, the implementation of the approach, achieved state-of-the-art
results and participated in international competitions: SemEval-2010
and CoNLL-2011. RelaxCor achieved second position in CoNLL-2011.
Coreference resolution is a natural language processing task that consists of determining the expressions
in a discourse that refer to the same real-world entity. The task has a direct effect on text mining as well as on
many natural language tasks that require discourse interpretation, such as summarization, question answering, or
machine translation. Resolving coreferences is essential in order to “understand” a text or a discourse.
The objectives of this thesis focus on research in coreference resolution with machine learning. Specifically,
the research objectives focus on the following areas:
+ Classification models: The most common classification models in the state of the art are based on the independent classification of
pairs of mentions. More recently, models that classify groups of mentions have appeared. One of the objectives of the thesis is
to incorporate the entity-mention model into the developed approach.
+ Problem representation: There is not yet a definitive representation of the problem. This thesis presents a
hypergraph representation.
+ Resolution algorithms: Depending on the problem representation and the classification model, resolution algorithms
can be very diverse. One of the objectives of this thesis is to find a resolution algorithm capable of using the
classification models in the hypergraph representation.
+ Knowledge representation: In order to manage knowledge from various sources, a symbolic and
expressive representation of this knowledge is needed. This thesis proposes the use of constraints.
+ Incorporation of world knowledge: Some coreferences cannot be resolved with linguistic information alone. Common sense
and world knowledge are often needed to resolve coreferences. This thesis proposes a method to extract
world knowledge from Wikipedia and incorporate it into the resolution system.
The main contributions of this thesis are (i) a new approach to the coreference resolution problem based on
constraint satisfaction, using a hypergraph to represent the problem and solving it with the relaxation labeling algorithm; and
(ii) research to improve the results by adding world information extracted from Wikipedia.
The presented approach can use the mention-pair and entity-mention models in combination, thus avoiding the problems
found in many other state-of-the-art approaches, such as contradictions among independent classifications,
lack of context, and lack of information. Furthermore, the presented approach allows information to be incorporated by adding constraints, and
research has been done on adding world information that improves the results.
RelaxCor, the system implemented during the thesis to experiment with the proposed approach, has achieved
results comparable to the best in the state of the art. It participated in the international competitions SemEval-2010 and
CoNLL-2011. RelaxCor obtained second position at CoNLL-2011.