Modeling Dynamic Heterogeneous Graph and Node Importance for Future Citation Prediction
Accurate citation count prediction of newly published papers could help
editors and readers rapidly figure out the influential papers in the future.
Though many approaches are proposed to predict a paper's future citation, most
ignore the dynamic heterogeneous graph structure or node importance in academic
networks. To cope with this problem, we propose a Dynamic heterogeneous Graph
and Node Importance network (DGNI) learning framework, which fully leverages
the dynamic heterogeneous graph and node importance information to predict
future citation trends of newly published papers. First, a dynamic
heterogeneous network embedding module is provided to capture the dynamic
evolutionary trends of the whole academic network. Then, a node importance
embedding module is proposed to capture the global consistency relationship to
figure out each paper's node importance. Finally, the dynamic evolutionary
trend embeddings and node importance embeddings calculated above are combined
to jointly predict the future citation counts of each paper, by a log-normal
distribution model according to multi-faceted paper node representations.
Extensive experiments on two large-scale datasets demonstrate that our model
significantly improves all indicators compared to the SOTA models.
Comment: Accepted by CIKM'202
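The abstract does not spell out the form of its log-normal model, so the following is only a minimal sketch of one common parameterization: cumulative citations follow a scaled log-normal CDF. The function name and the parameters `alpha` (eventual citation total), `mu`, and `sigma` are hypothetical; in a framework like the one described they would presumably be predicted from the paper's embeddings.

```python
import math


def lognormal_cumulative_citations(t, alpha, mu, sigma):
    """Expected cumulative citations t years after publication.

    Assumed form (not taken from the abstract):
        c(t) = alpha * Phi((ln t - mu) / sigma)
    where Phi is the standard normal CDF.
    """
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / sigma
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return alpha * cdf


# Hypothetical parameters for a single paper.
curve = [
    lognormal_cumulative_citations(t, alpha=100.0, mu=1.0, sigma=0.8)
    for t in range(0, 11)
]
```

Under this form, `alpha` caps the curve, `mu` sets when half of the eventual citations have arrived (at `t = exp(mu)`), and `sigma` controls how spread out the accumulation is.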
Clustering and Community Detection in Directed Networks: A Survey
Networks (or graphs) appear as dominant structures in diverse domains,
including sociology, biology, neuroscience and computer science. In most of the
aforementioned cases, graphs are directed, in the sense that there is
directionality on the edges, making the semantics of the edges non-symmetric.
An interesting feature that real networks present is the clustering or
community structure property, under which the graph topology is organized into
modules commonly called communities or clusters. The essence here is that nodes
of the same community are highly similar while on the contrary, nodes across
communities present low similarity. Revealing the underlying community
structure of directed complex networks has become a crucial and
interdisciplinary topic with a plethora of applications. Therefore, naturally
there is a recent wealth of research production in the area of mining directed
graphs - with clustering being the primary method and tool for community
detection and evaluation. The goal of this paper is to offer an in-depth review
of the methods presented so far for clustering directed networks along with the
relevant necessary methodological background and also related applications. The
survey commences by offering a concise review of the fundamental concepts and
methodological base on which graph clustering algorithms capitalize. Then we
present the relevant work along two orthogonal classifications. The first one
is mostly concerned with the methodological principles of the clustering
algorithms, while the second one approaches the methods from the viewpoint
regarding the properties of a good cluster in a directed network. Further, we
present methods and metrics for evaluating graph clustering results,
demonstrate interesting application domains and provide promising future
research directions.
Comment: 86 pages, 17 figures. Physics Reports Journal (To Appear)
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
Recommending on graphs: a comprehensive review from a data perspective
Recent advances in graph-based learning approaches have demonstrated their
effectiveness in modelling users' preferences and items' characteristics for
Recommender Systems (RSs). Most of the data in RSs can be organized into graphs
where various objects (e.g., users, items, and attributes) are explicitly or
implicitly connected and influence each other via various relations. Such a
graph-based organization brings benefits to exploiting potential properties in
graph learning (e.g., random walk and network embedding) techniques to enrich
the representations of the user and item nodes, which is an essential factor
for successful recommendations. In this paper, we provide a comprehensive
survey of Graph Learning-based Recommender Systems (GLRSs). Specifically, we
start from a data-driven perspective to systematically categorize various
graphs in GLRSs and analyze their characteristics. Then, we discuss the
state-of-the-art frameworks with a focus on the graph learning module and how
they address practical recommendation challenges such as scalability, fairness,
diversity, explainability and so on. Finally, we share some potential research
directions in this rapidly growing area.
Comment: Accepted by UMUA
A Network Science and Document Similarity based Hybrid Job Recommendation System
Job recommendation systems draw on several data sources to deliver better content to the end user. Developing a well-performing system requires complex hybrid approaches to representing similarity, based on the content of job postings and resumes as well as the interactions between them. We develop an efficient hybrid network-based job recommendation system that uses the Personalized PageRank algorithm to rank vacancies for users based on the similarity between resumes and job posts as textual documents, along with users' previous interactions with vacancies. Our approach achieved a recall of 50% and generated more job applications during an online A/B test than previous algorithms.
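As a rough illustration of ranking vacancies with Personalized PageRank, the sketch below runs plain power iteration on a tiny user-job graph. It is not the system described above: the graph, the node names, the helper `recommend`, and the restart probability `alpha=0.15` are all assumptions, and edges are unweighted, whereas a hybrid system would presumably weight them by document similarity.

```python
def personalized_pagerank(graph, source, alpha=0.15, iterations=50):
    """Power iteration for PPR with restart probability alpha.

    graph maps each node to a list of out-neighbours; every neighbour
    must also appear as a key.
    """
    rank = {node: 0.0 for node in graph}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            moving = (1.0 - alpha) * rank[node]
            if not neighbours:
                # Dangling node: send its mass back to the source.
                nxt[source] += moving
                continue
            share = moving / len(neighbours)
            for neighbour in neighbours:
                nxt[neighbour] += share
        # Restart: a fraction alpha of the total mass jumps to the source.
        nxt[source] += alpha
        rank = nxt
    return rank


def recommend(graph, user, jobs, k=2):
    """Rank jobs the user has not interacted with by their PPR score."""
    scores = personalized_pagerank(graph, user)
    seen = set(graph[user])
    candidates = [job for job in jobs if job not in seen]
    return sorted(candidates, key=lambda job: scores[job], reverse=True)[:k]


# Hypothetical toy data: u1 and u2 both interacted with j1, u2 also with j2.
toy_graph = {
    "u1": ["j1"],
    "u2": ["j1", "j2"],
    "j1": ["u1", "u2"],
    "j2": ["u2"],
    "j3": [],
}
top_jobs = recommend(toy_graph, "u1", ["j1", "j2", "j3"])
```

Here `j2` outranks `j3` for `u1` because it is reachable through the shared interaction with `j1`, while `j3` has no connection to the user at all.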
Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme
Personalized PageRank (PPR) stands as a fundamental proximity measure
in graph mining. Since computing an exact single-source PPR (SSPPR) query answer is prohibitive,
most existing solutions turn to approximate queries with guarantees. The
state-of-the-art solutions for approximate SSPPR queries are index-based and
mainly focus on static graphs, while real-world graphs are usually dynamically
changing. However, existing index-update schemes cannot achieve a sub-linear
update time. Motivated by this, we present an efficient indexing scheme to
maintain indexed random walks in expected time after each graph update.
To reduce the space consumption, we further propose a new sampling scheme to
remove the auxiliary data structure for vertices while still supporting
index update cost on evolving graphs. Extensive experiments show that our
update scheme achieves orders of magnitude speed-up on update performance over
existing index-based dynamic schemes without sacrificing the query efficiency
- …
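To make the idea of an index of random walks concrete, here is a minimal sketch of Monte Carlo SSPPR estimation with a deliberately naive update step. It is not the scheme from the abstract: on an edge insertion it scans every stored walk and re-samples the suffix of any walk that visits the changed vertex, which is linear in the index size rather than sub-linear. All function names and the stopping probability `alpha` are assumptions.

```python
import random


def sample_walk(graph, start, alpha, rng):
    """Random walk that stops with probability alpha at each step,
    or when it reaches a node with no out-neighbours."""
    walk = [start]
    node = start
    while rng.random() >= alpha and graph[node]:
        node = rng.choice(graph[node])
        walk.append(node)
    return walk


def build_index(graph, walks_per_node, alpha, rng):
    """Pre-sample a fixed number of walks from every node."""
    return {
        source: [sample_walk(graph, source, alpha, rng)
                 for _ in range(walks_per_node)]
        for source in graph
    }


def estimate_ppr(index, source):
    """Estimate PPR(source, v) as the fraction of walks ending at v."""
    walks = index[source]
    counts = {}
    for walk in walks:
        counts[walk[-1]] = counts.get(walk[-1], 0) + 1
    return {node: count / len(walks) for node, count in counts.items()}


def insert_edge(graph, index, u, v, alpha, rng):
    """Naive update: add edge u -> v (v must already be a node), then
    re-sample every stored walk from its first visit to u onwards."""
    graph[u].append(v)
    for walks in index.values():
        for i, walk in enumerate(walks):
            if u in walk:
                first = walk.index(u)
                walks[i] = walk[:first] + sample_walk(graph, u, alpha, rng)


# Hypothetical toy graph; node "c" is unreachable until the update.
rng = random.Random(0)
graph = {"a": ["b"], "b": ["a"], "c": []}
index = build_index(graph, walks_per_node=200, alpha=0.2, rng=rng)
insert_edge(graph, index, "a", "c", alpha=0.2, rng=rng)
estimate = estimate_ppr(index, "a")
```

The prefix of each walk before its first visit to `u` is unaffected by the new edge, so only the suffix needs re-sampling. An efficient scheme would avoid the full scan over the index, which is the cost this sketch does not address.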